Abstract

This  report  presents  the  theoretical  development, 
evaluation,  and  applications  of  a  new  nonparametric  family 
of  continuous,  differentiable,  sample  distribution  func¬ 
tions.  Given  a  random  sample  of  independent,  identically 
distributed,  random  variables,  estimators  are  constructed 
which  converge  uniformly  to  the  underlying  distribution. 

A  smoothing  routine  is  proposed  which  preserves  the  dis¬ 
tribution  function  properties  of  the  estimators.  Using 
mean  integrated  square  error  as  a  criterion,  the  new  esti¬ 
mators  are  shown  to  compare  favorably  against  the  empirical 
distribution  function.  As  density  estimators,  their 
derivatives  are  shown  to  be  competitive  with  other  con¬ 
tinuous  approximations.  Numerous  graphical  examples  are 
given.  New  goodness  of  fit  tests  for  the  normal  and 
extreme  value  distributions  are  proposed  based  on  the  new 
estimators.  Eight  new  goodness  of  fit  statistics  are 
developed.  Extensive  Monte  Carlo  studies  are  conducted  to 
determine  the  critical  values  and  powers  for  tests  when  the 
null  hypothesis  is  completely  specified  and  when  the 
parameters  of  the  null  hypothesis  are  estimated.  These 
tests are shown to be comparable with or superior to tests
currently used. Forty-eight new estimators of the location
parameter of a symmetric distribution are proposed based
on  the  new  models.  For  mild  deviations  from  the  normal 
distribution,  some  new  estimators  are  shown  to  be  superior 
to  established  robust  estimators.  Robust  characteristics 
of  the  new  estimators  are  discussed. 




NONPARAMETRIC  ESTIMATION  OF  DISTRIBUTION  AND 
DENSITY  FUNCTIONS  WITH  APPLICATIONS 

I.  Introduction 

This dissertation develops and evaluates new nonparametric
techniques for use in data analysis. A new
family  of  nonparametric,  continuous,  differentiable  sample 
distribution  functions  is  proposed  to  model  univariate 
random  variables  with  continuous,  unimodal  densities.  Much 
of  the  motivation  for  this  research  effort  was  the  dominance 
of  the  empirical  distribution  function  (EDF)  as  a  basis  for 
goodness  of  fit  tests  and  robust  estimation  of  parameters. 
This  research  presents  a  continuous,  differentiable  alterna¬ 
tive  to  the  EDF  and  its  applications  to  statistical  infer¬ 
ence. 

The  EDF  has  long  served  as  the  mainstay  for  sta¬ 
tistical  inference.  Only  recently,  as  in  a  paper  by  Green 
and  Hegazy,  have  other  sample  distribution  functions  even 
been  considered  as  bases  for  goodness  of  fit  tests 
(Ref  29) .  These  alternatives  are  still  classical  step 
functions  and  are  shown  to  generate  powerful  goodness  of 
fit  tests.  The  authors  of  the  Princeton  study  on  robust 
estimation  of  a  location  parameter,  while  using  the  EDF 


exclusively  in  their  estimators,  are  careful  to  point  out: 
"We  ought  not  to  close  our  eyes  to  other  definitions  of  the 
empirical  cumulative"  (Ref  5:225).  Their  results,  using 
the  EDF,  have  given  a  large  impetus  to  the  search  for 
robust  estimators.  Should  not,  then,  a  continuous,  dif¬ 
ferentiable,  alternative  to  the  EDF  offer  the  potential 
for  improvement  in  goodness  of  fit  testing  and  robust 
parameter  estimation?  This  investigation  shows  that  the 
new  nonparametric  family  is  a  powerful  tool  for  modeling 
univariate  random  variables,  for  goodness  of  fit  tests  and 
for  robust  estimation  of  the  location  parameter  of  a 
symmetric  distribution. 

Our  analysis  begins  with  the  historical  background 
of  sample  distribution  functions  given  in  Chapter  II. 
Plotting  positions  for  random  samples  and  their  relation¬ 
ship  to  sample  distribution  functions  are  discussed. 

Chapter  III  presents  the  theoretical  development  of  the  new 
family  of  nonparametric  distribution  functions.  We  demon¬ 
strate  that  the  properties  of  a  distribution  function  are 
preserved  and  discuss  the  conditions  for  uniform  conver¬ 
gence.  A  routine  is  proposed  to  generate  a  smoother 
approximation  for  both  the  distribution  and  density  func¬ 
tions.  Six  specific  nonparametric  models  are  generated 
from  the  new  family  and  used  for  the  remainder  of  the 
analysis.  Three  of  these  models  are  adaptive  based  on  the 
estimated  tail  length  of  the  underlying  distribution  from 
a random sample. Chapter IV examines the literature for
techniques  of  distribution  and  density  function  estima¬ 
tion.  A  Monte  Carlo  analysis  is  then  conducted  to  compare 
the  distribution  and  density  function  estimates  using  mean 
integrated  square  error  as  the  criterion.  While  not  spe¬ 
cifically  designed  as  density  function  estimates,  the  new 
nonparametric  models  are  shown  to  be  competitive  with  or 
superior  to  two  other  continuous  density  function  esti¬ 
mates.  Several  examples  of  the  nonparametric  estimates 
are  graphically  displayed.  The  chapter  concludes  with  a 
discussion  of  a  continuous  nonparametric  estimation  of  the 
hazard  function  which  results  from  the  differentiability 
of  the  distribution  function  estimate.  Chapter  V  addresses 
the  goodness  of  fit  problem.  After  a  brief  historical 
survey,  we  propose  eight  new  goodness  of  fit  statistics. 

An  extensive  Monte  Carlo  analysis  is  conducted  to  determine 
the  critical  values  for  each  test  statistic  for  null  dis¬ 
tributions  which  are  completely  specified  and  when  param¬ 
eters  are  estimated.  Two  null  distributions,  the  normal 
and  the  extreme  value  distributions,  are  considered.  Sub¬ 
sequent  Monte  Carlo  power  studies  show  that  the  new  tests 
are  competitive  with  or  superior  to  certain  established 
goodness  of  fit  tests.  Chapter  VI  describes  techniques 
for  parameter  estimation  using  the  new  models.  Following 
a  brief  survey  of  location  parameter  estimation  and  robust¬ 
ness,  we  propose  forty-eight  new  estimators  of  the  location 
parameter of a symmetric distribution. The estimators
are  compared  with  the  sample  mean,  sample  median,  and 
certain  robust  estimates  proposed  by  Huber  and  Hampel. 

The  comparisons  are  made  using  standardized  empirical  vari¬ 
ances  determined  by  Monte  Carlo  simulation,  maximum  and 
average  relative  deficiencies,  and  robust  characteristics 
based on approximated influence curves over nine alternative
symmetric distributions. For relatively mild deviations
from the normal distribution, certain new nonparametric
estimators are shown to have smaller deficiencies
than  the  other  estimators  included  in  the  study.  The  final 
chapter  summarizes  the  major  results  of  this  research 
effort  and  also  indicates  potential  applications  of  the 
new  nonparametric  models.  We  conclude  with  a  discussion 
of  areas  for  future  research. 
II. Background


Sample  Distribution  Functions  (SDFs) 

One  of  the  initial  steps  in  data  analysis  is  the 
formulation  of  a  sample  cumulative  distribution  function. 
The  most  common  of  these  is  the  empirical  distribution 
function  (EDF)  whose  properties  are  listed  in  Gibbons 
(Ref 27:73-75). Let $S_n(x)$ be the EDF:

$$
S_n(x) =
\begin{cases}
0, & x < X_{(1)} \\
i/n, & X_{(i)} \le x < X_{(i+1)}, \quad i = 1, \ldots, n-1 \\
1, & x \ge X_{(n)}
\end{cases}
$$


It  is  easy  to  construct  other  sample  distribution 

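functions which are also step functions. Let $\{g_i\}$,
$i = 1, \ldots, n$, be a nondecreasing sequence of real numbers
on $[0,1]$ with $g_n = 1$. Now define

$$
G_n(x) =
\begin{cases}
0, & x < X_{(1)} \\
g_i, & X_{(i)} \le x < X_{(i+1)}, \quad i = 1, \ldots, n-1 \\
1, & x \ge X_{(n)}
\end{cases}
$$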


Clearly  Gn(x)  possesses  all  of  the  properties  of  a  dis¬ 
tribution  function. 

However,  if  we  relax  the  property  that 

$\lim_{x \to -\infty} G_n(x) = 0$ or $\lim_{x \to \infty} G_n(x) = 1$,
we get improper sample distribution functions. An example is

$$
G_n(x) =
\begin{cases}
0, & x < X_{(1)} \\
i/(n+1), & X_{(i)} \le x < X_{(i+1)}, \quad i = 1, \ldots, n-1 \\
n/(n+1), & x \ge X_{(n)}
\end{cases}
$$

It  can  be  easily  shown  that  the  improper  distribution 
function  just  defined  has  the  same  absolute  convergence 
properties  as  the  empirical  distribution  function.  At 
this  point,  let  us  defer  a  discussion  of  the  properties  of 
either  proper  or  improper  distribution  functions. 

Several  authors  have  considered  specific  alterna¬ 
tives  to  the  empirical  distribution  function.  In  choosing 
a  goodness  of  fit  criterion,  Pyke  used  the  mean  ranks  as 
the  basis  for  his  modified  empirical  distribution  function 
(Refs  10,68).  Vogt  also  considered  the  mean  ranks  in  his 
evaluation  of  maximal  deviations  from  the  EDF  and  his 
variant  of  the  EDF  (Ref  98).  In  a  goodness  of  fit  test 
for  a  completely  specified  continuous  symmetric  distribu¬ 
tion,  Schuster  proposes  an  unbiased  estimator  Gn(x)  as  the 
average  of  the  EDF  and  another  EDF  based  on  reflecting  the 
sample  about  the  center  of  symmetry  (Ref  82:1).  He  later 
considers  the  estimate  of  the  distribution  function  when 
the  center  of  symmetry  is  unknown.  For  a  suitable  choice 
of  an  estimator  of  the  center  of  symmetry,  it  can  be  shown 
that  the  estimate  formed  by  reflection  about  the  estimated 
center  of  symmetry  is  asymptotically  better  than  the  EDF 
in  specific  cases  (Ref  83) .  In  testing  for  symmetry, 
Rothman and Woodroofe required their sample distribution
function to be invariant under the transformation $x \to -x$.
Thus, they used $2F_n^*(x) = S_n(x^+) + S_n(x^-)$ where $S_n$ is the
EDF (Ref 76). Hill and Rao generalized this sample distribution
function in another article investigating the
center of symmetry. They point out that the invariance
property is preserved if $F_n^*$ is replaced by $F_n^{(a)}$, where
$0 < a < 1$ and

$$
F_n^{(a)}(x) =
\begin{cases}
a F_n(x^+) + (1-a) F_n(x^-), & x \le 0 \\
(1-a) F_n(x^+) + a F_n(x^-), & x > 0
\end{cases}
$$

for center of symmetry zero (Ref 36).

Forming continuous sample distribution functions
is a simple task. Let $\{X_{(i)}\}$, $i = 1, \ldots, n$, be an ordered
sample. Choose a plotting rule for the $\{X_{(i)}\}$ to form the
set of plotted values $\{G(X_{(i)})\}$, $i = 1, \ldots, n$. A linear
interpolation of the $G(X_{(i)})$ values for each interval
$[X_{(i)}, X_{(i+1)}]$ gives a continuous function defined on
$[X_{(1)}, X_{(n)}]$. If $G(X_{(1)}) = 0$ and $G(X_{(n)}) = 1$, then the function
is a proper distribution function. If not, we can construct
extrapolation points $X_{(0)}$ and $X_{(n+1)}$ such that
$G(X_{(0)}) = 0$ and $G(X_{(n+1)}) = 1$. Linear interpolation based on
these extrapolated points again results in a continuous
proper sample distribution function. Spline smoothing or
exponential extrapolation for the $X_{(0)}$ and $X_{(n+1)}$ points
are two other methods proposed by Andrews, et al., for
forming alternatives to the EDF (Ref 5:224-225).
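The linear-interpolation construction just described can be sketched in a few lines of Python. This is only an illustration: the function name `linear_sdf` is hypothetical, and the reflection rule used for the extrapolation points ($X_{(0)} = 2X_{(1)} - X_{(2)}$, $X_{(n+1)} = 2X_{(n)} - X_{(n-1)}$) is an assumption borrowed from the choice made later in Chapter III, not a rule stated in this section.

```python
from bisect import bisect_right


def linear_sdf(sample, positions):
    """Return a continuous, piecewise-linear sample distribution function.

    sample    : observed values, in any order
    positions : plotting positions G(X_(i)) for the ordered sample,
                nondecreasing values in [0, 1]
    """
    xs = sorted(sample)
    gs = list(positions)
    # Add extrapolation points only when the plotted values do not already
    # reach 0 and 1. The reflection rule is an assumption of this sketch.
    if gs[0] > 0.0:
        xs.insert(0, 2 * xs[0] - xs[1])
        gs.insert(0, 0.0)
    if gs[-1] < 1.0:
        xs.append(2 * xs[-1] - xs[-2])
        gs.append(1.0)

    def G(x):
        if x <= xs[0]:
            return 0.0
        if x >= xs[-1]:
            return 1.0
        i = bisect_right(xs, x) - 1            # xs[i] <= x < xs[i + 1]
        t = (x - xs[i]) / (xs[i + 1] - xs[i])  # relative position in interval
        return gs[i] + t * (gs[i + 1] - gs[i])

    return G
```

For example, `linear_sdf([3.0, 1.0, 2.0], [0.25, 0.5, 0.75])` uses the mean ranks of a sample of size three and returns a proper, continuous distribution function on the extended interval.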

Whether  we  use  a  step  function  or  a  continuous 
one,  the  values  of  the  sample  distribution  function  at  the 
observed  data  points  can  be  used  to  estimate  the  under¬ 
lying  cumulative  distribution  function.  The  next  section 
will  examine  several  choices  for  these  values,  their  use 
as  plotting  positions,  and  the  relationship  between  plot¬ 
ting  positions  and  sample  distribution  functions. 

Plotting  Positions 

Used  in  graphical  data  analysis,  plotting  positions 
represent  the  estimated  value  of  the  underlying  probabil¬ 
ity  distribution  function.  As  mentioned  earlier,  these 
plotting  positions  could  be  the  values  of  some  sample  dis¬ 
tribution  functions  at  the  observed  data  points. 

As  early  as  1930,  Hazen  recognized  that  the  values 
of  the  EDF  were  inappropriate  for  plotting  annual  flood 
data.  He  chose  the  midpoint  of  the  jumps  of  the  EDF  as 
his  plotting  position  (Ref  35).  A  limited  survey  comparing 
various  choices  of  plotting  positions  was  undertaken  by 
Kimball  (Ref  45) .  Some  choices  were  based  on  specific 
underlying  probability  distributions.  White  proposes 
plotting  positions  for  the  Weibull  distribution  based  on 
the  expected  value  of  reduced  log-Weibull  order  statistics 
(Ref 107). For the normal distribution, Blom suggests
plotting the ith order statistic at $(i - 0.375)/(n + 0.25)$. He
argues that this plotting rule

    . . . leads to a practically unbiased estimate
    of $\sigma$ (the shape parameter) with a mean square deviation
    which is about the same as that of the unbiased
    best linear estimate.

He also states that Hazen's choice of plotting position
for the normal ". . . leads to a biased estimate of $\sigma$
with nearly minimum mean square deviation about $\sigma$" (Ref 7).
While  the  previous  discussion  concerned  some  isolated 
plotting  conventions,  we  now  examine  some  basic  systems 
of  plotting  positions. 

Rank Distributions. Let $X_{(1)}, \ldots, X_{(n)}$ be an
ordered sample from an underlying probability distribution
$F(x)$. The distribution of $F(X_{(i)})$, $i = 1, \ldots, n$, is the rank
distribution. It can be shown that this distribution is
a beta distribution for each $i$ and is independent of the
underlying distribution $F$, so long as $F$ is differentiable
(Refs 19, 44). A plotting position for the ith order statistic
can be thought of as a point on the ith rank distribution.
The question arises as to what point on the
rank distribution should be used as a representative
choice for $F(X_{(i)})$. Three measures of central tendency,
the mean, median, and mode, are all contenders.
$E(F(X_{(i)})) = i/(n+1)$, the mean rank, has the property that
it divides $[0,1]$ into $n+1$ equally probable intervals. The
median rank, approximated by $(i - 0.3)/(n + 0.4)$, can be used
as a better representative of skewed distributions, which
most rank distributions are. For a unimodal distribution
the mode rank, $(i-1)/(n-1)$, approximates the maximum of
the probability density function of the rank distribution.
Thus, the selection of a plotting position is equivalent
to selecting a point from a beta distribution.
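As a numerical check on the three candidates, the sketch below compares them for a single rank distribution. It relies on the standard fact that $F(X_{(i)})$ follows a Beta$(i, n-i+1)$ distribution, whose distribution function equals a binomial tail probability. The function names are hypothetical, and finding the exact median by bisection is a convenience of this sketch rather than anything proposed in the text.

```python
from math import comb


def rank_cdf(p, i, n):
    """P(F(X_(i)) <= p): at least i of n uniform draws fall at or below p."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(i, n + 1))


def median_rank(i, n, tol=1e-12):
    """Exact median of the i-th rank distribution, located by bisection."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if rank_cdf(mid, i, n) < 0.5:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def rank_summaries(i, n):
    """Mean, median (exact and approximate), and mode rank for n >= 2."""
    return {
        "mean": i / (n + 1),
        "median": median_rank(i, n),
        "median_approx": (i - 0.3) / (n + 0.4),
        "mode": (i - 1) / (n - 1),
    }
```

Calling `rank_summaries(5, 20)` shows the usual ordering for a right-skewed rank distribution (mode below median below mean), with the approximation $(i - 0.3)/(n + 0.4)$ lying close to the exact median.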

Blom's Formula. Plotting positions can also be
derived from rather general expressions. Given choices
of $\alpha$ and $\beta$ such that $\alpha, \beta \le 1$, a plotting position, $G_i$, can
be defined as:

$$
G_i = \frac{i - \alpha}{n - \alpha - \beta + 1}
$$

For specific choices of $\alpha$ and $\beta$, see reference 7. From
the above formula, one can easily generate the same plotting
positions in the rank distributions by judicious
choices of $\alpha$ and $\beta$.
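To make the "judicious choices" concrete, the following sketch evaluates Blom's formula for several $(\alpha, \beta)$ pairs. The pairs were worked out here by matching each convention's formula against $(i - \alpha)/(n - \alpha - \beta + 1)$; they are not quoted from reference 7, and the names `blom_position`, `CONVENTIONS`, and `positions` are hypothetical.

```python
def blom_position(i, n, alpha, beta):
    """Blom's general plotting position (i - alpha) / (n - alpha - beta + 1)."""
    return (i - alpha) / (n - alpha - beta + 1)


# (alpha, beta) pairs that reproduce familiar conventions. These pairs are
# derived by matching formulas and are an assumption of this sketch.
CONVENTIONS = {
    "edf": (0.0, 1.0),              # i / n
    "mean_rank": (0.0, 0.0),        # i / (n + 1)
    "mode_rank": (1.0, 1.0),        # (i - 1) / (n - 1)
    "median_rank": (0.3, 0.3),      # (i - 0.3) / (n + 0.4)
    "hazen": (0.5, 0.5),            # (i - 0.5) / n
    "blom_normal": (0.375, 0.375),  # (i - 0.375) / (n + 0.25)
}


def positions(n, convention):
    """Plotting positions for the order statistics i = 1, ..., n."""
    alpha, beta = CONVENTIONS[convention]
    return [blom_position(i, n, alpha, beta) for i in range(1, n + 1)]
```

For instance, `positions(4, "hazen")` returns the midpoints of the EDF jumps for a sample of size four.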

A slightly more general plotting position can be
defined by

$$
G_i = \frac{i + \alpha}{n + \beta}, \quad \text{where } -1 \le \alpha \le \beta \le 1.
$$

Once again, this formula allows for generation of common
plotting positions by correct choices of $\alpha$ and $\beta$.
Table II.1 summarizes some common plotting conventions.
TABLE II.1

PLOTTING POSITIONS OF THE ith ORDER STATISTIC

     Formula                            Description

 1.  i/n                                value of the empirical distribution
                                        function
 2.  i/(n+1)                            mean rank
 3.  (i-1)/(n-1)                        mode rank
 4.  (i-0.3)/(n+0.4)                    median rank (approximation)
 5.  (i-0.5)/n                          midpoint of the jump of the empirical
                                        distribution function
 6.  [n(2i-1)-1]/[2(n^2-1)]             average of the mean and mode ranks
 7.  (i-0.375)/(n+0.25)                 efficient approximation for the
                                        normal distribution
 8.  (i-α)/(n-α-β+1),  α, β ≤ 1         Blom's general plotting position
 9.  (i+α)/(n+β),  -1 ≤ α ≤ β ≤ 1       a more general plotting position


While the choice of plotting position is subject
to the analyst's discretion, one must be aware of the problem
of choosing plotting positions and generating a sample
distribution function based on these positions. Once a
plotting position is picked, any number of sample distribution
functions can be constructed. However, given a
specific plotting rule (midpoint of the jumps, limit from
the right, etc.), a sample distribution step function
uniquely determines the plotting positions.
III.  New  Nonparametric  Sample  Distribution  Functions

Introduction 

Having  already  seen  the  uses  of  various  discrete 
plotting  positions  and  their  relationship  to  sample  dis¬ 
tribution  step  functions,  we  now  propose  a  new  family  of 
approximations.  The  next  section  presents  the  theoretical 
development  of  a  family  of  nonparametric,  continuous,  dif¬ 
ferentiable  sample  distribution  functions.  Properties  of 
distribution  functions  are  preserved  and  uniform  conver¬ 
gence  is  demonstrated.  A  smoothing  routine  is  selected 
which  again  preserves  the  distribution  function  properties. 
Three  specific  nonparametric  models  are  developed  by  a 
detailed  analysis  of  the  stylized  and  random  samples  from 
selected members of the Generalized Exponential Power
distribution. Finally, three adaptive nonparametric models
are proposed based on using percentile ratios as a
discriminant.

Theoretical  Development 

Consider a random sample $X_1, \ldots, X_n$ of size $n$ from
an unknown univariate, continuous, probability distribution
function $F$. Let $X_{(1)}, \ldots, X_{(n)}$ be the ordered sample. Now
let $G_i = G(X_{(i)})$, $i = 1, \ldots, n$, be the plotting position for
the ith order statistic based on some sample distribution
function $G$.

Our goal is to estimate $F$ by a nonparametric
approach while preserving the following properties of the
estimator, $F_n$:

1.  $F_n$ is differentiable

2.  $F_n$ is a distribution function

3.  $F_n(X_{(i)}) = G_i$, $i = 1, \ldots, n$

Linear interpolation will, of course, satisfy conditions
2 and 3, but we require differentiability at the data
points. What is needed is a family of nondecreasing
curves on $[X_{(i)}, X_{(i+1)}]$ such that

$$
\lim_{x \to X_{(i)}^-} F_n'(x) = \lim_{x \to X_{(i)}^+} F_n'(x) \quad \text{for each } i = 1, \ldots, n.
$$

Arbitrarily, set the derivative equal to zero at each data
point. Now, consider the midpoint of the interval
$[X_{(i)}, X_{(i+1)}]$. Let

$$
F_n\!\left(\frac{X_{(i)} + X_{(i+1)}}{2}\right) = \frac{G_i + G_{i+1}}{2}.
$$

Consider the function $-a \cos y$, which is monotonically
increasing on the interval $[0, \pi]$ where $a$ is a constant.
Making the transformation

$$
y = \pi \left( \frac{x - X_{(i)}}{X_{(i+1)} - X_{(i)}} \right)
$$

yields

$$
F_n(x) = \frac{G_i + G_{i+1}}{2} - a \cos\!\left( \pi \, \frac{x - X_{(i)}}{X_{(i+1)} - X_{(i)}} \right). \tag{3.1}
$$

Requiring $F_n(X_{(i)}) = G_i$ for each $i = 1, \ldots, n$ gives

$$
a = \frac{G_{i+1} - G_i}{2}.
$$

Defining extrapolation points $X_{(0)}$ and $X_{(n+1)}$ such that
$G_0 = 0$ and $G_{n+1} = 1$ completes the derivation. Thus,
equation 3.1 becomes:

$$
F_n(x) =
\begin{cases}
0, & x < X_{(0)} \\
G_i + \dfrac{G_{i+1} - G_i}{2} \left[ 1 - \cos\!\left( \pi \, \dfrac{x - X_{(i)}}{X_{(i+1)} - X_{(i)}} \right) \right], & X_{(i)} \le x < X_{(i+1)}, \quad i = 0, \ldots, n \\
1, & x \ge X_{(n+1)}
\end{cases}
\tag{3.2}
$$


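Differentiating, one immediately obtains an estimate
of the probability density function.

$$
f_n(x) =
\begin{cases}
\dfrac{\pi}{2} \left( \dfrac{G_{i+1} - G_i}{X_{(i+1)} - X_{(i)}} \right) \sin\!\left( \pi \, \dfrac{x - X_{(i)}}{X_{(i+1)} - X_{(i)}} \right), & X_{(i)} \le x < X_{(i+1)}, \quad i = 0, \ldots, n \\
0, & \text{elsewhere}
\end{cases}
\tag{3.3}
$$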


Clearly,  the  derived  Fn(x)  satisfies  the  three 
properties  required.  However,  the  utility  of  such  an  esti¬ 
mate  can  certainly  be  questioned  at  this  point. 
Figures 3.1 and 3.2 show the estimates of the cumulative
and  density  functions  respectively  for  a  random  sample 
of  size  20  from  a  normal  distribution  with  zero  mean  and 
unit  variance.  The  plotting  positions  chosen  were  the 
average of the mean and mode ranks. The extrapolation
points $X_{(0)}$ and $X_{(n+1)}$ were chosen as $X_{(0)} = 2X_{(1)} - X_{(2)}$
and $X_{(n+1)} = 2X_{(n)} - X_{(n-1)}$. The estimated CDF does
approximate the true CDF in a continuous fashion, but provides
the  same  inferences  about  the  underlying  population  as 
the  plotting  positions  themselves.  The  estimated  PDF  plot 
is  analogous  to  a  histogram  with  the  intervals  chosen  to 
contain  only  one  data  point.  Some  shape  of  the  underlying 
density  can  be  inferred,  especially  with  larger  sample 
sizes,  but  any  inference  concerning  the  density  shape  or 
type  is  limited. 
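Equations 3.2 and 3.3 translate directly into code. The sketch below is a minimal illustration, assuming distinct sample values and the reflected extrapolation points described above; the names `cosine_sdf` and `plotting_position` are hypothetical.

```python
from bisect import bisect_right
from math import cos, pi, sin


def cosine_sdf(sample, plotting_position):
    """Build F_n (eq. 3.2) and f_n (eq. 3.3) from a sample.

    plotting_position(i, n) must return G_i for i = 1, ..., n.
    Assumes at least two distinct sample values.
    """
    xs = sorted(sample)
    n = len(xs)
    gs = [plotting_position(i, n) for i in range(1, n + 1)]
    # Extrapolation points X_(0) and X_(n+1) with G_0 = 0 and G_(n+1) = 1.
    xs = [2 * xs[0] - xs[1]] + xs + [2 * xs[-1] - xs[-2]]
    gs = [0.0] + gs + [1.0]

    def locate(x):
        i = bisect_right(xs, x) - 1        # xs[i] <= x < xs[i + 1]
        width = xs[i + 1] - xs[i]
        return i, (x - xs[i]) / width, width

    def cdf(x):
        if x < xs[0]:
            return 0.0
        if x >= xs[-1]:
            return 1.0
        i, t, _ = locate(x)
        return gs[i] + 0.5 * (gs[i + 1] - gs[i]) * (1 - cos(pi * t))

    def pdf(x):
        if x < xs[0] or x >= xs[-1]:
            return 0.0
        i, t, width = locate(x)
        return 0.5 * pi * (gs[i + 1] - gs[i]) / width * sin(pi * t)

    return cdf, pdf
```

With mean ranks, `cosine_sdf([0.0, 1.0, 2.0, 3.0], lambda i, n: i / (n + 1))` returns a CDF that passes through each plotting position and a density that vanishes at every data point, which is the behavior the text goes on to criticize.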

The  basic  undesirable  property  in  the  development 
thus  far  has  been  the  zero  derivative  of  the  estimated 
cumulative  distribution  function  at  the  data  points.  To 
avoid  these  zero  derivatives,  consider  applying  a  variation 
of  the  jackknife.  This  technique  was  developed  by 
Quenouille  (Refs  70,71)  as  a  means  of  reducing  the  bias  of 
an  estimator.  In  an  abstract,  Tukey  proposes  using  the 
technique  for  robust  interval  estimation  (Ref  96).  An 
excellent  survey  and  bibliography  is  given  by  Miller 
(Ref 58). More recent applications and extensions of
the jackknife may be found in Gray, et al., and Cressie
(Refs 15, 28).

Analogous to Quenouille's development, let
$X_{(1)}, \ldots, X_{(n)}$ be an ordered sample. Choose $k \le n/2$ to be
the number of subsamples. Beginning at $X_{(1)}$, form the
subsamples by assigning each successive order statistic to a
new subsample until the (k+1)st order statistic is reached.
Repeat this assignment process beginning with this order
statistic, using the same ordering of subsamples, until all
n order statistics are assigned.

Mathematically, if $k$ is the number of subsamples,
then $n = km + r$ where $m = [n/k]$ and $r = n$ modulo $k$. Now let $\ell$
index the subsamples, $\ell = 1, \ldots, k$, and let $Y_{(j,\ell)}$ be the jth
element of subsample $\ell$. Thus,

$$
Y_{(j,\ell)} = X_{(\ell + k(j-1))}
$$

where

$$
j = 1, \ldots, m \quad \text{if } \ell > r, \qquad j = 1, \ldots, m+1 \quad \text{if } \ell \le r.
$$

Clearly, there will be $k$ ordered subsamples, $r$ of which
have size $m+1$ and $k-r$ have size $m$.
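The assignment rule amounts to dealing the order statistics out in rotation. A minimal sketch, assuming a hypothetical helper name `make_subsamples` and zero-based list slicing in place of the one-based index $\ell$:

```python
def make_subsamples(sample, k):
    """Split the ordered sample into k interleaved, ordered subsamples.

    Subsample l (one-based) holds X_(l), X_(l+k), X_(l+2k), ..., that is,
    Y_(j,l) = X_(l + k(j-1)).
    """
    xs = sorted(sample)
    if not 1 <= k <= len(xs) // 2:
        raise ValueError("k must satisfy 1 <= k <= n/2")
    # Slice xs[l::k] with zero-based l picks every k-th order statistic.
    return [xs[l::k] for l in range(k)]
```

With $n = 10$ and $k = 4$, so that $m = 2$ and $r = 2$, the first two subsamples have three elements and the last two have two elements.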

Returning to the zero derivative problem, now that
the subsamples are generated, consider the following estimate
of the cumulative distribution function. Form $k$
estimates, $SF_\ell(x)$, where $SF_\ell(x) = F_{n^*}(x)$ for $\ell = 1, \ldots, k$
and $F_{n^*}(x)$ is the continuous, differentiable, sample
distribution function defined in equation 3.2 and

$$
n^* =
\begin{cases}
m & \text{if } \ell > r \\
m+1 & \text{if } \ell \le r
\end{cases}
$$

The derivatives $SF_\ell'(x)$ are zero at each
data point of the subsamples. Now simply average these
estimates to form the sample cumulative function,

$$
SF(x) = \frac{1}{k} \sum_{\ell=1}^{k} SF_\ell(x) \tag{3.4}
$$

and sample density function

$$
sf(x) = SF'(x) = \frac{1}{k} \sum_{\ell=1}^{k} SF_\ell'(x). \tag{3.5}
$$

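Note that each of the subsamples has its own
extrapolated points, $Y_{(0,\ell)}$ and $Y_{(n^*+1,\ell)}$. Now let

$$
X_{\min} = \min_{\ell} \{ Y_{(0,\ell)} \}
$$

and

$$
X_{\max} = \max_{\ell} \{ Y_{(n^*+1,\ell)} \}.
$$

Thus, the cumulative and density functions in equations
3.4 and 3.5 are formally defined as:

$$
SF(x) =
\begin{cases}
0, & x < X_{\min} \\
\dfrac{1}{k} \sum_{\ell=1}^{k} SF_\ell(x), & X_{\min} \le x \le X_{\max} \\
1, & x > X_{\max}
\end{cases}
\tag{3.6}
$$

$$
sf(x) =
\begin{cases}
\dfrac{1}{k} \sum_{\ell=1}^{k} SF_\ell'(x), & X_{\min} \le x \le X_{\max} \\
0, & \text{elsewhere}
\end{cases}
\tag{3.7}
$$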
Two important results occur by this averaging.
First, while we required that $F_{n^*}(Y_{(j,\ell)}) = G_j$ for each
data point in the subsample, $SF(Y_{(j,\ell)})$ is not necessarily
equal to the $G_{(\ell + k(j-1))}$ for the entire sample. Thus, we
are no longer tied to restricting our estimates to the
plotting positions of the original sample. Second, while
each $SF_\ell'(Y_{(j,\ell)}) = 0$, $SF'(Y_{(j,\ell)}) = 0$ only if there are at
least $k$ data points identically equal to $Y_{(j,\ell)}$. Since
the  assumed  underlying  distribution  function  is  continuous, 
the  probability  of  such  an  event  is  zero.  Of  course,  in 
actual  data  sets,  due  to  measurement  accuracy,  this  event 
may  occur.  However,  since  it  would  require  k  occurrences 
in  the  same  random  sample  to  force  a  zero  derivative,  the 
limitation  does  not  appear  to  be  unreasonable.  Figures  3.3 
and  3.4  show  the  effect  of  averaging  on  the  normal  sample 
of  size  20  considered  previously.  The  number  of  subsamples, 
k,  was  chosen  as  four.  Both  the  distribution  and  density 
functions  are  beginning  to  identify  the  shape  of  the  under¬ 
lying  random  variable. 
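The averaging step in equation 3.6 can be sketched as follows. The code is illustrative only: it uses the mean rank $j/(n^*+1)$ as a default plotting position (the figures in the text use the average of the mean and mode ranks), reflects the end points of each subsample to obtain its extrapolation points, and assumes distinct sample values. The names `_cosine_cdf` and `averaged_sdf` are hypothetical.

```python
from bisect import bisect_right
from math import cos, pi


def _cosine_cdf(xs, gs):
    """Equation 3.2 for one ordered subsample xs with plotting positions gs."""
    xs = [2 * xs[0] - xs[1]] + xs + [2 * xs[-1] - xs[-2]]
    gs = [0.0] + gs + [1.0]

    def cdf(x):
        if x < xs[0]:
            return 0.0
        if x >= xs[-1]:
            return 1.0
        i = bisect_right(xs, x) - 1
        t = (x - xs[i]) / (xs[i + 1] - xs[i])
        return gs[i] + 0.5 * (gs[i + 1] - gs[i]) * (1 - cos(pi * t))

    return cdf


def averaged_sdf(sample, k, position=lambda j, m: j / (m + 1)):
    """SF(x): the average of the k subsample estimates (eq. 3.4 / 3.6).

    Requires k <= n/2 so that every subsample has at least two points.
    """
    xs = sorted(sample)
    parts = [xs[l::k] for l in range(k)]   # Y_(j,l) = X_(l + k(j-1))
    cdfs = [
        _cosine_cdf(p, [position(j, len(p)) for j in range(1, len(p) + 1)])
        for p in parts
    ]
    return lambda x: sum(F(x) for F in cdfs) / k
```

Because each subsample estimate is itself a distribution function, their average is nondecreasing, equals 0 below the smallest extrapolation point, and equals 1 above the largest one.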


Properties 

Now  that  we  have  defined  estimates  for  both  the 
cumulative  distribution  and  density  functions  by  equations 
3.6  and  3.7,  we  need  to  examine  their  properties.  Spe¬ 
cifically,  we  will  consider  the  distribution  function 
properties  and  uniform  convergence. 
Figure 3.4. Sample PDF: 4 Subsamples vs N(0,1)


Let $R^1$ be the real line, $\mathcal{B}$ the Borel field on $R^1$,
and $P$ a probability measure defined on $\mathcal{B}$. The function
$F$ defined on $(R^1, \mathcal{B}, P)$ by $F(x) = P(\{t \in R^1 : -\infty < t \le x\})$ is the
distribution function of $P$. Any standard probability text
gives the properties of $F$ (see references 13 and 49).

$F$ satisfies the following three properties:

1.  $F$ is nondecreasing

2.  $F$ is continuous from the right

3.  $\lim_{x \to -\infty} F(x) = 0$ and $\lim_{x \to \infty} F(x) = 1$

The function $SF(x)$ defined in equation 3.6 clearly satisfies
these properties. Further, since each $SF_\ell(x)$ is
differentiable for each $x \in R^1$, $SF(x)$ is also differentiable.

To  examine  the  convergence  of  our  estimator  in 
equation  3.6,  we  begin  by  examining  the  convergence  of 
step  functions  for  subsamples. 

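Theorem 3.1. If $S_{n^*}$ is a sample distribution
function based on a subsample of the form

$$
\{ Y_{(j,\ell)} \}, \quad j = 1, \ldots, n^*, \quad \ell = 1, \ldots, k < \infty,
$$

where $Y_{(j,\ell)} = X_{(\ell + k(j-1))}$ as defined in the previous section, and

$$
n^* =
\begin{cases}
m & \text{if } \ell > r \\
m+1 & \text{if } \ell \le r
\end{cases}
$$

then $S_{n^*}(x)$ converges uniformly to $F(x)$, where

$$
S_{n^*}(x) =
\begin{cases}
0, & x < Y_{(1,\ell)} \\
j/n^*, & Y_{(j,\ell)} \le x < Y_{(j+1,\ell)}, \quad j = 1, \ldots, n^*-1 \\
1, & x \ge Y_{(n^*,\ell)}
\end{cases}
$$

Proof. Without loss of generality, let $F$ have a finite
support $[a, b]$ in $R^1$. Let

$$
D_{n^*} = \sup_{-\infty < x < \infty} \left| S_{n^*}(x) - F(x) \right|
        = \sup_{-\infty < x < \infty} \left| \frac{j}{n^*} \cdot \frac{n}{i} \, S_n(x) - F(x) \right|
$$

where $S_n(x)$ is the EDF. Now

$$
D_{n^*} \le \sup_{-\infty < x < \infty} \left| S_n(x) - F(x) \right|
          + \sup_{-\infty < x < \infty} \left| \left( \frac{jn}{n^* i} - 1 \right) S_n(x) \right|.
$$

By construction, $n = km + r$, $i = \ell + k(j-1)$, $r < k$, and $1 \le k < \infty$.
For simplicity, consider the case $n^* = m$ ($n^* = m+1$ is similar
with slightly more algebra). So,

$$
D_{n^*} \le \sup_{-\infty < x < \infty} \left| S_n(x) - F(x) \right|
          + \sup_{-\infty < x < \infty} \left| \frac{m(\ell + k(j-1)) - j(km + r)}{m(\ell + k(j-1))} \right| S_n(x)
$$

$$
\le \sup_{-\infty < x < \infty} \left| S_n(x) - F(x) \right|
  + \sup_{-\infty < x < \infty} \left| \frac{m(\ell - k) - jr}{m(\ell + k(j-1))} \right| S_n(x).
$$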


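$$
\lim_{n \to \infty} D_{n^*} \le \lim_{n \to \infty} D_n
  + \lim_{n \to \infty} \sup_{-\infty < x < \infty} \left| \frac{m(\ell - k) - jr}{m(\ell + k(j-1))} \right| S_n(x),
$$

where $D_n = \sup_{-\infty < x < \infty} |S_n(x) - F(x)|$.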
Case i: $x = a$

$n \to \infty$ implies $m \to \infty$, $j \to 1$, $S_n(x) \to 0$

Case ii: $x \in (a, b]$

$n \to \infty$ implies $m \to \infty$, $j \to \infty$

Since $\ell \le k < \infty$ and $r < k < \infty$, and since $P[\lim_{n \to \infty} D_n = 0] = 1$ by
Glivenko's Theorem (Ref 73:353), $P[\lim_{n \to \infty} D_{n^*} = 0] = 1$.

We  now  have  established  uniform  convergence  for 
sample  distribution  functions  based  on  our  constructed 
subsamples.  Let  us  consider  a  general  sample  distribution 
function  defined  on  these  subsamples.  We  will  continue  to 
use  n*=m. 


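Theorem 3.2. $SF_\ell^-(x)$ converges uniformly to $F(x)$
where

$$
SF_\ell^-(x) =
\begin{cases}
0, & x < Y_{(1,\ell)} \\
\dfrac{j + \alpha}{m + \beta}, & Y_{(j,\ell)} \le x < Y_{(j+1,\ell)}, \quad j = 1, \ldots, m \\
1, & x \ge Y_{(m+1,\ell)}
\end{cases}
$$

and $-1 \le \alpha \le \beta \le 1$, $Y_{(m+1,\ell)} = Y_{(m,\ell)} + \delta$,
where $\delta \to 0$ as $m \to \infty$.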


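Proof.

$$
SF_\ell^-(x) =
\begin{cases}
S_m(x), & x < Y_{(1,\ell)} \\
\dfrac{m(j + \alpha)}{j(m + \beta)} \, S_m(x), & Y_{(j,\ell)} \le x < Y_{(j+1,\ell)}, \quad j = 1, \ldots, m \\
S_m(x), & x \ge Y_{(m+1,\ell)}
\end{cases}
$$

Now let

$$
D_n^* = \sup_{-\infty < x < \infty} \left| SF_\ell^-(x) - F(x) \right|
      \le \sup_{-\infty < x < \infty} \left| S_m(x) - F(x) \right|
        + \sup_{-\infty < x < \infty} \left| \left( \frac{m(j + \alpha)}{j(m + \beta)} - 1 \right) S_m(x) \right|.
$$

Again, if $x$ is an interior point or an end point the second
term approaches zero as $n \to \infty$. Thus, by Theorem 3.1

$$
P[\lim_{n \to \infty} D_n^* = 0] = 1.
$$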


A  slight  modification  of  the  hypothesis  of 
Theorem  3.2  gives  another  family  of  estimators  which  con¬ 
verge  uniformly  to  F(x).  The  proof  of  the  following 
theorem  is  similar  and  thus  omitted. 


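Theorem 3.3. $SF_\ell^+(x)$ converges uniformly to $F(x)$
where

$$
SF_\ell^+(x) =
\begin{cases}
0, & x < Y_{(0,\ell)} \\
\dfrac{j + 1 + \alpha}{m + \beta}, & Y_{(j,\ell)} \le x < Y_{(j+1,\ell)}, \quad j = 0, 1, \ldots, m-1 \\
1, & x \ge Y_{(m,\ell)}
\end{cases}
$$

and $-1 \le \alpha \le \beta \le 1$, $Y_{(0,\ell)} = Y_{(1,\ell)} - \delta$,
where $\delta \to 0$ as $m \to \infty$.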

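We now have, by the previous two theorems, two
families of sequences of estimators which converge uniformly
to the underlying probability distribution
function $F(x)$. Now consider $SF_\ell(x)$ as derived in the previous
section and define $G_j = SF_\ell^-(Y_{(j,\ell)})$ for $j = 0, 1, \ldots, m+1$.
Thus

$$
G_{j+1} = SF_\ell^+(Y_{(j,\ell)}) \quad \text{for } j = 0, 1, \ldots, m
$$

since $SF_\ell^+(Y_{(j,\ell)}) = SF_\ell^-(Y_{(j+1,\ell)})$.

We know by construction that

$$
SF_\ell^-(x) \le SF_\ell(x) \le SF_\ell^+(x) \quad \text{for every } x.
$$

This implies that

$$
\lim_{n \to \infty} \sup_{-\infty < x < \infty} \left| SF_\ell^-(x) - F(x) \right|
\le \lim_{n \to \infty} \sup_{-\infty < x < \infty} \left| SF_\ell(x) - F(x) \right|
\le \lim_{n \to \infty} \sup_{-\infty < x < \infty} \left| SF_\ell^+(x) - F(x) \right|.
$$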

From  Theorems  3.2  and  3.3,  we  can  summarize  with 
the  following  theorem. 


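Theorem 3.4. $SF_\ell(x)$ converges uniformly to $F(x)$
where

$$
SF_\ell(x) =
\begin{cases}
0, & x < Y_{(0,\ell)} \\
G_j + \dfrac{G_{j+1} - G_j}{2} \left[ 1 - \cos\!\left( \pi \, \dfrac{x - Y_{(j,\ell)}}{Y_{(j+1,\ell)} - Y_{(j,\ell)}} \right) \right], & Y_{(j,\ell)} \le x < Y_{(j+1,\ell)}, \quad j = 0, 1, \ldots, m \\
1, & x \ge Y_{(m+1,\ell)}
\end{cases}
$$

and $G_j = G(Y_{(j,\ell)})$, $j = 0, 1, \ldots, m+1$,
where

$$
G(x) =
\begin{cases}
0, & x < Y_{(1,\ell)} \\
\dfrac{j + \alpha}{m + \beta}, & Y_{(j,\ell)} \le x < Y_{(j+1,\ell)}, \quad j = 1, \ldots, m \\
1, & x \ge Y_{(m+1,\ell)}
\end{cases}
$$

for $-1 \le \alpha \le \beta \le 1$.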

To  prove  our  final  result,  we  need  a  lemma. 

Lemma 3.5. A finite convex combination of estimators
which converge uniformly to $F(x)$ also converges
uniformly to $F(x)$.

Proof. Let $\{T_{i,n}(x)\}$, $i = 1, \ldots, k$, be a sequence
of estimators converging uniformly to $F(x)$, i.e.,

$$
P\left( \lim_{n \to \infty} \sup_{-\infty < x < \infty} \left| T_{i,n}(x) - F(x) \right| = 0 \right) = 1 \quad \text{for } i = 1, \ldots, k
$$

and let $k < \infty$.

Now let

$$
T_n(x) = \sum_{i=1}^{k} a_i T_{i,n}(x) \quad \text{and} \quad \sum_{i=1}^{k} a_i = 1 \quad \text{for } 0 \le a_i \le 1.
$$

Then

$$
\lim_{n \to \infty} \sup_{-\infty < x < \infty} \left| T_n(x) - F(x) \right|
= \lim_{n \to \infty} \sup_{-\infty < x < \infty} \left| \sum_{i=1}^{k} a_i T_{i,n}(x) - \sum_{i=1}^{k} a_i F(x) \right|
$$

$$
\le \lim_{n \to \infty} \sup_{-\infty < x < \infty} \sum_{i=1}^{k} a_i \left| T_{i,n}(x) - F(x) \right|
$$

$$
\le \sum_{i=1}^{k} a_i \lim_{n \to \infty} \sup_{-\infty < x < \infty} \left| T_{i,n}(x) - F(x) \right|
$$

since $k < \infty$.

Each  term  in  the  sum  is  zero  by  hypothesis.  The  uniform 
convergence  of  the  finite  convex  combination  follows 
immediately. 

Applying  the  previous  lemma  to  the  function  SF(x) 
as  defined  in  equation  3.6,  we  can  state  the  following 
theorem. 


Theorem  3.6.  SF(x)  as  defined  in  equation  3.6, 
converges  uniformly  to  F(x). 

At this point we have an estimator SF(x) of F(x) which is itself
a continuous, differentiable distribution function and also
converges uniformly. The same results, however, are not available
for the derivative, sf(x). While it is true that sf(x) is
continuous and differentiable almost everywhere, convergence
properties will have to be inferred from the Monte Carlo analysis
of Chapter IV.

Although the estimator family has been defined and the properties
listed, a quick glance at Figures 3.3 and 3.4 indicates possible
room for improvement. If we could dampen some of the sinusoidal
activity in both the sample cumulative and sample density
functions, our estimators should better approximate the underlying
process. Two methods of such smoothing were initially
investigated: spline smoothing and a Fourier smoothing method.

Once SF(x) and sf(x) have been determined we can generate their
values at each data point to form the sets
$\{SF(X_i)\}_{i=1,\ldots,n}$ and $\{sf(X_i)\}_{i=1,\ldots,n}$. At
this point, however, note that we are not restricted to the
original data set. We could choose a set $\{Z_j\}_{j=1,\ldots,m}$
and its corresponding sets $\{SF(Z_j)\}_{j=1,\ldots,m}$ and
$\{sf(Z_j)\}_{j=1,\ldots,m}$ by an arbitrary rule, such as equally
spaced points in the domain or inversion of SF(x) at some
specified plotting positions. Thus m, the number of points used in
smoothing, can be as large (or small) as we choose.

To apply spline smoothing (Ref 109) we can proceed in two
directions: (1) independently smooth both the distribution and
density functions, or (2) smooth only the distribution (density)
function and analytically differentiate (integrate) to get the
density (distribution) function. Proceeding in either of these
directions opens the possibility of negative density values.

A second smoothing technique was hypothesized from the density
and cumulative estimation work of Kronmal and Tarter (Refs 40, 48).
Their investigation yielded estimates with impressive mean
integrated square errors (MISEs). Analogous to the spline methods,
we could apply the Fourier approximation method of Kronmal and
Tarter to the distribution and density functions independently, or
apply it to one of them and derive the other. The same drawback
occurs with the Fourier expansion as with splines: negative
density values. Since our initial goal in this development was to
preserve the distribution function properties of our estimators as
well as add differentiability, it would be foolish at this point
to abandon this aim in favor of the possible smoothing advantages
of spline or Fourier expansions. Thus, both spline smoothing and
the use of Fourier expansions were discarded.

The availability of both distribution and density function
estimates at arbitrary points in the domain suggested an
alternative approach. In a 1979 article, Efron (Ref 23) developed
a "bootstrap method" related to the "double Monte Carlo" method
proposed by Moore (Ref 59). Both methods estimate the distribution
function based on sample data and then create a pseudosample by
sampling from this estimated distribution. Rather than sampling
from the estimated distribution, as these authors suggest,
consider inverting the estimated distribution at specific points
according to some rule. Specifically, solve $SF(Z_{(j)}) = G_j$
for $Z_{(j)}$, where $\{G_j\}_{j=1,\ldots,m}$ are predetermined
plotting positions. The set $\{Z_{(j)}\}_{j=1,\ldots,m}$ is now a
pseudosample based on some regular divisions, the plotting
positions $G_j$, of SF(x). Having generated this pseudosample, now
apply equations 3.6 and 3.7 to form new estimates of the
distribution and density functions. Of course, this inversion
process could be repeated and other estimates formed on the basis
of new pseudosamples.

The previous derivation clearly preserves the distribution
function properties of the estimators, as well as
differentiability and continuity. By inverting SF(x) at the
plotting positions $G_j$, we also preserve ordering and spacing
information contained in the original sample, in contrast to the
random sampling procedures of Moore and Efron. Although no formal
proof of uniform convergence of this smooth distribution function
estimator is presented, empirical evidence from graphical and
Monte Carlo analysis of this estimator strongly suggests that
uniform convergence is preserved. We will postpone a detailed
analysis of these estimators to the results of Monte Carlo
analyses of the next chapter.
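The inversion step described above can be sketched in Python for a
single subsample. This is only an illustration under stated
assumptions, not the program used in this study: it assumes the
cosine-interpolated form of the estimator between consecutive
knots, takes the knots (the extrapolated order statistics) and
their plotting positions as given, and uses hypothetical function
names (`sf`, `sf_inverse`, `pseudosample`). Because each cosine
segment has a closed-form inverse, no iterative root finder is
needed in this simplified setting.

```python
import bisect
import math


def sf(x, knots, g):
    """Evaluate a cosine-interpolated distribution function.

    knots: increasing points Y(0) < ... < Y(m+1)
    g:     increasing plotting positions with g[0] = 0 and g[-1] = 1
    """
    if x < knots[0]:
        return 0.0
    if x >= knots[-1]:
        return 1.0
    j = bisect.bisect_right(knots, x) - 1      # knots[j] <= x < knots[j+1]
    t = (x - knots[j]) / (knots[j + 1] - knots[j])
    return g[j] + 0.5 * (g[j + 1] - g[j]) * (1.0 - math.cos(math.pi * t))


def sf_inverse(p, knots, g):
    """Solve sf(z) = p for 0 < p < 1 using the inverse of one cosine segment."""
    j = bisect.bisect_right(g, p) - 1          # g[j] <= p < g[j+1]
    j = min(max(j, 0), len(g) - 2)
    u = (p - g[j]) / (g[j + 1] - g[j])         # fraction of the jump covered
    t = math.acos(1.0 - 2.0 * u) / math.pi     # fraction of the interval
    return knots[j] + t * (knots[j + 1] - knots[j])


def pseudosample(knots, g, positions):
    """Invert the estimator at each plotting position (no random sampling)."""
    return [sf_inverse(p, knots, g) for p in positions]


# Illustrative values only (not data from this study).
knots = [-1.0, 0.0, 1.0, 2.0, 3.0]
g = [0.0, 0.25, 0.5, 0.75, 1.0]
positions = [(i - 0.3) / (5 + 0.4) for i in range(1, 6)]  # assumed median-rank approximation
z = pseudosample(knots, g, positions)
```

Because the pseudosample is obtained by inversion rather than by
random draws, it is ordered automatically, and evaluating `sf` at
each `z` value returns the plotting position that produced it.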

Figures 3.5 through 3.9 give a graphical display of the smoothing
technique proposed for our random sample of size 20 from the
normal distribution. Figures 3.5 and 3.6 show the smoothed
approximation and the true underlying standard normal
distribution. Figures 3.7 and 3.8 compare the smoothed
approximation to a normal distribution whose parameters are
minimum variance unbiased estimates. Note the performance of the
nonparametric model without the assumption of normality. Figure
3.9 compares the smoothed approximation to the empirical
cumulative distribution function. Choices for the plotting
positions, inversion points, and other variables have been made
using methods discussed in the next section.

Choice  of  Variables  for 
the  Estimators 

Since the approximation method and smoothing technique have been
defined, we now seek to identify the variables needed to form our
final estimators. The investigation will examine five sets of
variables: (1) the number of subsamples for a given sample size;
(2) plotting positions, $\{G_j\}_{j=1,\ldots,n^*}$, for each
subsample; (3) extrapolation values, $Y_{(0)}$ and $Y_{(n^*+1)}$,
for each subsample; (4) inversion points for the smoothing routine
to generate the pseudosample; and (5) the number of inversions.
Judicious choices of these sets of variables should give us an
estimator with good approximating properties.

Due to the array of possible choices of the variables and their
complex interaction in the estimators, it was necessary to
restrict each set of variables to a manageable set of choices. We
will rely on numerical and Monte Carlo analysis to determine the
choices for our variables. No claim of optimality will be made,
but we will attempt to justify our variable selections as
reasonable for the situations considered. First, let us examine
each set of variables and its restricted domain.

Number of Subsamples. Given an ordered sample of size n, let k be
the number of subsamples generated via the method outlined earlier
in this chapter. We require that $k \le n/2$, for each subsample
to contain at least two points, and also that k remains finite as
n approaches infinity to satisfy the uniform convergence of the
unsmoothed estimator of equation 3.6. For samples of size 100, k
was initially chosen as an element of {5, 10, 15, 20}. Subsequent
choices of the domain of k were made and will be identified at
appropriate steps in the analysis.

Plotting Positions. Given each ordered subsample of size n*, a
plotting position $G_j$, j = 1,...,n*, is assigned to each order
statistic. The following plotting positions were chosen from
Table II.1:

1. Mean ranks

2. Median ranks

3. Midpoint of the jumps of the empirical distribution function

4. Average of the mean and mode ranks

5. Any of the above four plotting positions based on the entire
sample, rather than each subsample. For example, each $Y_{(i,j)}$
has plotting position $G_{\ell}$, $\ell = 1, \ldots, n$,
associated with it, where
$Y_{(i,j)} = X_{(i+k(j-1))} = X_{(\ell)}$, the $\ell$th order
statistic of the entire sample.


Extrapolation Values. For each subsample define

\[
Y_{(0)} = Y_{(1)} - A\,(Y_{(2)} - Y_{(1)})
\quad \text{and} \quad
Y_{(n^*+1)} = Y_{(n^*)} + A\,(Y_{(n^*)} - Y_{(n^*-1)}),
\]

where A is the extrapolation value. The choices of A that were
considered are:

1. 0, which puts a finite probability at each extreme order
statistic of each subsample

2. 0.5

3. 1.0

4. 1.5

5. Choose A equal to the ratio $G_1/(G_2 - G_1)$. This choice
extrapolates the data points proportionately to their plotting
positions. Since the plotting positions listed previously are
symmetric, A is also equal to
$(1 - G_{n^*})/(G_{n^*} - G_{n^*-1})$. Note that if plotting
position 5 is used, then the extrapolation points are calculated
only once based on the entire sample and then remain constant
for each subsample.
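As a small illustration of the extrapolation rule just described,
the following Python sketch computes the two extrapolated points
for an ordered subsample and the proportional choice of A
(choice 5). The function names are hypothetical, and the
closed forms used for the mean ranks, j/(n+1), and for the
midpoints of the jumps of the empirical distribution function,
(j - 0.5)/n, are the commonly used definitions and are assumed
here rather than copied from Table II.1.

```python
def mean_ranks(n):
    """Assumed standard form of the mean ranks: j / (n + 1)."""
    return [j / (n + 1) for j in range(1, n + 1)]


def midpoint_ranks(n):
    """Assumed standard form of the EDF jump midpoints: (j - 0.5) / n."""
    return [(j - 0.5) / n for j in range(1, n + 1)]


def extrapolate(y, a):
    """Return (Y(0), Y(n*+1)) for an ordered subsample y and extrapolation value a."""
    lower = y[0] - a * (y[1] - y[0])
    upper = y[-1] + a * (y[-1] - y[-2])
    return lower, upper


def proportional_a(g):
    """Choice 5: A = G1 / (G2 - G1) for plotting positions g."""
    return g[0] / (g[1] - g[0])


# Illustrative ordered subsample (not data from this study).
y = [1.0, 2.0, 4.0, 7.0]
points_half = extrapolate(y, 0.5)
a_mean = proportional_a(mean_ranks(len(y)))
a_mid = proportional_a(midpoint_ranks(len(y)))
```

Under these assumed forms, the proportional rule gives A = 1 for
the mean ranks and A = 0.5 for the midpoint positions, so choice 5
coincides with one of the fixed choices for those two conventions.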


Inversion Points. Once the subsamples are defined, we need a rule
for inverting equation 3.6 to create a pseudosample. Our choices
for inversion points are the first four plotting positions listed
previously based on the entire sample. Thus the pseudosample
$\{Z_i\}_{i=1,\ldots,N}$ is defined by $Z_i = SF^{-1}(G_i)$, where
$G_i$ is one of the four plotting conventions based on a sample of
size N. Numerical calculations of $SF^{-1}(G_i)$ were accomplished
via a Newton-Raphson method. Adjustments to the extreme points of
the pseudosample were sometimes necessary. See Appendix 6 for a
further discussion.
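The numerical inversion can be illustrated with a short Python
sketch of a Newton-Raphson iteration for solving cdf(z) = p. This
is not the routine of Appendix 6: the function name `invert_cdf`
is hypothetical, the bisection safeguard is an addition made here
for robustness, and the standard normal distribution stands in for
SF(x) and sf(x), which are not reproduced in the sketch.

```python
from statistics import NormalDist


def invert_cdf(cdf, pdf, p, x0, lo, hi, tol=1e-10, max_iter=100):
    """Solve cdf(x) = p by Newton-Raphson, falling back to bisection.

    lo and hi must bracket the root: cdf(lo) <= p <= cdf(hi).
    """
    x = x0
    for _ in range(max_iter):
        err = cdf(x) - p
        if abs(err) < tol:
            return x
        # Shrink the bracket using the sign of the current error.
        if err > 0:
            hi = x
        else:
            lo = x
        slope = pdf(x)
        x_new = x - err / slope if slope > 0 else None   # Newton step
        if x_new is None or not (lo < x_new < hi):
            x_new = 0.5 * (lo + hi)                      # bisection step
        x = x_new
    return x


# Stand-in distribution for the demonstration.
nd = NormalDist()
z_upper = invert_cdf(nd.cdf, nd.pdf, 0.975, 0.0, -10.0, 10.0)
z_lower = invert_cdf(nd.cdf, nd.pdf, 0.025, 0.0, -10.0, 10.0)
```

The derivative needed by the Newton step is the density estimate
itself, which is one practical benefit of having both SF(x) and
sf(x) available in closed form.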

Number of Inversions. Since the inversion process can be repeated
by creating another pseudosample, the number of repetitions needs
to be determined. Due to the computational effort required and
some preliminary investigation of repeated smoothing, a maximum of
two inversions was considered practical. Estimators smoothed more
than twice improved very little, if at all. Thus the number of
inversions, I, was constrained to the set {0, 1, 2}.

Now that we have restricted our variables to manageable sets, let
us describe the procedure for selecting specific distribution
function estimators by identifying particular choices of our
variables. Our goal is to provide reasonable values for these
variables in a limited situation in the hope of robustness over a
wider class. To that end, let us consider only sample size 100 for
the present. We also need a criterion for choice of the variables.
A widely accepted criterion is mean integrated square error (MISE)
(Refs 40, 48, 103, 104, 105):

\[
\text{MISE} = E \int_{-\infty}^{\infty} [\hat{f}(x) - f(x)]^2 \, w(x) \, dx,
\]

where f is the true function, $\hat{f}$ is the estimator, and w is
the weight function. The integrated square error can be
approximated numerically since our estimators are continuous. As a
criterion, we will use an approximation to the integrated square
error for both the distribution and density functions. For
comparison purposes, other criteria were also used. These included
Kolmogorov-Smirnov (K-S) distance, K-S integral and modified K-S
integral distances, Cramer von Mises (CVM) and modified CVM
integrals, Anderson-Darling (AD) and modified AD integrals, and
average square error (ASE). For a discussion of these criteria,
see Appendix 1.
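A numerical approximation to the integrated square error can be
sketched as follows. This is a minimal Python illustration using
the trapezoidal rule on an equally spaced grid; the function name,
the grid limits, and the unit weight function are assumptions made
for the example, and the quadrature actually used in this study is
not specified here. Averaging such values over Monte Carlo samples
gives an estimate of the MISE.

```python
import math
from statistics import NormalDist


def integrated_square_error(estimate, truth, lo, hi, n_grid=2001,
                            weight=lambda x: 1.0):
    """Trapezoidal approximation of the integral of (estimate - truth)^2 * weight."""
    h = (hi - lo) / (n_grid - 1)
    total = 0.0
    for i in range(n_grid):
        x = lo + i * h
        term = (estimate(x) - truth(x)) ** 2 * weight(x)
        # End points receive half weight in the trapezoidal rule.
        total += term * (0.5 if i in (0, n_grid - 1) else 1.0)
    return total * h


# Demonstration with a deliberately shifted normal density as the "estimate".
true_density = NormalDist(0.0, 1.0).pdf
shifted_density = NormalDist(0.1, 1.0).pdf
ise = integrated_square_error(shifted_density, true_density, -8.0, 8.0, 4001)

# Closed form for two unit-variance normal densities whose means differ by 0.1.
exact = (1.0 - math.exp(-0.1 ** 2 / 4.0)) / math.sqrt(math.pi)
```

For this smooth test case the trapezoidal value agrees with the
closed form to many decimal places, which gives some confidence in
using the same rule on the continuous estimators.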

To numerically evaluate the variable choices, we also need to know
the true underlying distribution. We chose three members of the
Generalized Exponential Power (GEP) distribution family as our
test distributions (see Appendix 2). The members chosen were the
double exponential, normal, and uniform distributions. Although we
restrict ourselves to a symmetric family, the three members
selected give three distinct measures of tail length, ranging from
leptokurtic to mesokurtic to platykurtic. The density functions
also possess unique central shapes: the double exponential being
concave, the normal convex, and the uniform linear. As such, it
was conjectured that estimators which performed well over this
limited set of distributions would perform well over a much wider
class.

The variable selection procedure itself consisted of two main
steps: examination of "stylized" samples and examination of random
samples. We shall deal with each in turn.


Stylized Samples. Given a sample size of 100, we generated a
"stylized" sample by inverting each test distribution at the
inversion points. We repeated the process for all four possible
inversion values. Next, we calculated values for all of the
distance criteria for the 400 combinations of the number of
subsamples, plotting positions, extrapolation values, and
inversion points. The rationale at this stage is related to the
underlying philosophy of Fisher consistency (Ref 73:281). Strict
Fisher consistency requires that an estimator yield the true
parameter when true proportions are realized in the sample. For
our purposes, we require an estimator to be reasonably close to
the true value when the input sample is stylized. Table III.1
summarizes the results of the stylized sample analysis. Four sets
of variables were chosen for future consideration because of their
"good" performance with respect to the modified CVM integral
criterion. All three sets of variables which minimized the
modified CVM integral for the distribution function were selected.
The other set selected performed well for both the normal and
double exponential distributions.

                            TABLE III.1

        VARIABLE SETS BASED ON MODIFIED CVM INTEGRAL VALUES
                   FOR THE DISTRIBUTION FUNCTION

                                  Distribution
  Variables (1)   Double Exponential   Normal             Uniform

  (5,3,3,2)       6.83 x 10^-7         3.78 x 10^-7       1.78 x 10^-6
  (5,4,3,2)       3.28 x 10^-7 (2)     6.19 x 10^-7       3.39 x 10^-6
  (5,5,3,2)       6.91 x 10^-7         4.43 x 10^-6       1.13 x 10^-9 (2)
  (5,4,5,3)       1.32 x 10^-6         3.51 x 10^-7 (2)   4.62 x 10^-7

All entries listed are values of the modified Cramer von Mises
integral of the distribution function.

Note 1: Variable sets are indexed based on their domains given
earlier in this chapter. Terms correspond to (number of
subsamples, plotting position, extrapolation value, inversion
points).

Note 2: Minimum modified CVM integral value for that
distribution.
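The construction of a stylized sample can be sketched in Python as
follows. This is an illustration under assumptions, not the code
used for Table III.1: the median ranks are taken in the common
approximate form (i - 0.3)/(n + 0.4), the test distributions are
used in standard form (unit-scale double exponential, standard
normal, uniform on [0, 1]) rather than in the GEP parametrization
of Appendix 2, and all function names are hypothetical.

```python
import math
from statistics import NormalDist


def median_ranks(n):
    """Assumed approximation to the median ranks: (i - 0.3) / (n + 0.4)."""
    return [(i - 0.3) / (n + 0.4) for i in range(1, n + 1)]


def double_exponential_quantile(p):
    """Inverse distribution function of the unit-scale double exponential."""
    return math.log(2.0 * p) if p < 0.5 else -math.log(2.0 * (1.0 - p))


def uniform_quantile(p):
    """Inverse distribution function of the uniform distribution on [0, 1]."""
    return p


def stylized_sample(quantile, positions):
    """Invert a test distribution at the chosen inversion points."""
    return [quantile(p) for p in positions]


positions = median_ranks(100)
stylized_normal = stylized_sample(NormalDist().inv_cdf, positions)
stylized_dexp = stylized_sample(double_exponential_quantile, positions)
stylized_unif = stylized_sample(uniform_quantile, positions)
```

Each stylized sample has exactly the spacings that the true
distribution assigns to the chosen plotting positions, which is
why an estimator is expected to reproduce the true function
closely when such a sample is its input.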

In examining the results of the stylized sample analysis, four
observations were made. First, inversion points based on the
median ranks outperformed the other choices. Second, plotting
position 5 was clearly superior when the underlying distribution
was uniform. This observation confirmed our intuition since all of
the information in a sample from the uniform distribution is
contained in the two extreme order statistics. Plotting position 5
uses an extrapolation scheme based on the entire sample and thus
estimates the bounds of the distribution better than using
extrapolated points based on the subsamples. Third, overall, the
extrapolation values appeared arbitrary. Fourth, the number of
subsamples determined in the "best" sets of variables seems low,
probably due to the ideal spacings generated by the stylized
samples. Based on these observations, we decided to fix the
plotting positions, extrapolation values, and inversion points as
determined by the four best variable sets. For these combinations,
we now want to evaluate the functions on a limited number of
random samples.

Random Samples. Given a fixed set of four combinations of plotting
positions, extrapolation values, and inversion values as
determined from the stylized samples, we now propose to determine
choices for the number of subsamples and the number of inversions.
Twenty-five random samples of size 100 from each of the test
distributions were drawn and evaluated via averaged modified CVM
integrals for both the distribution and density functions.
Table III.2 lists the optimal choices of the sets of variables
with respect to the CVM criteria. Based on the results of the
random sample analysis, four conclusions were drawn: (1) there is
no clear-cut optimal choice of variables across all three test
distributions; (2) the optimal choice for the uniform performs
poorly for the other two distributions; (3) plotting position 4,
the average of the mean and mode ranks, outperformed plotting
position 3, the midpoint of the jumps of the empirical
distribution function, in every case; and (4) the inversion values
at the median ranks outperformed the others in most cases.

                            TABLE III.2

                OPTIMAL CHOICES FROM RANDOM SAMPLES

                                Modified CVM Integral Values
  Variables (1)            Distribution Function   Density Function

  1. Double Exponential
     A. (5,4,5,3,0)        7.56 x 10^-4 (2)        3.19 x 10^-2
     B. (15,4,3,2,2)       7.80 x 10^-4            1.52 x 10^-3 (2)

  2. Normal
     A. (25,4,3,2,1)       1.27 x 10^-3            1.12 x 10^-3 (2)
     B. (25,4,3,2,2)       1.17 x 10^-3 (2)        1.31 x 10^-3

  3. Uniform
     (25,5,3,2,2)          5.00 x 10^-4 (2)        1.22 x 10^-3 (2)

Note 1: Variables are listed in the same order as in Table III.1
with the last variable added being the number of inversions.

Note 2: Denotes minimum value for that criterion and
distribution.

From these observations, we decided on forming three different
models using the optimum, or nearly optimum, choices for each test
distribution. Table III.3 summarizes the three models. Model 1 was
developed from nearly optimum choices based on the double
exponential distribution, Model 2 from the normal distribution,
and Model 3 from the uniform distribution. These models were
derived solely for sample size 100.

                            TABLE III.3

                 NONPARAMETRIC MODELS 1, 2, AND 3

  Model 1
     Number of subsamples     15
     Plotting positions       average of mean and mode ranks
     Extrapolation value      1.0
     Inversion points         median ranks
     Number of inversions     2

  Model 2
     Number of subsamples     25
     Plotting positions       average of mean and mode ranks
     Extrapolation value      1.0
     Inversion points         median ranks
     Number of inversions     1

  Model 3
     Number of subsamples     33
     Plotting positions       median ranks of the entire sample
     Extrapolation value      1.0
     Inversion points         median ranks
     Number of inversions     2

All models are valid for sample size 100 only.

Other random sample sizes were then investigated. Given random
samples of size 20, 50, 175, and 250, we fixed all of the model
parameters except for the number of subsamples. We also introduced
a sixth pair of variables, N, the number of points to invert, and
K, the number of subsamples used after an inversion. Based on
twenty-five random samples from each sample size and using the
modified CVM integral criterion, we developed nearly optimal
selections of the number of subsamples, k, as well as N and K.
Table III.4 gives the relationships between sample size and the
number of subsamples for the three models based on their
corresponding GEP distribution.

                            TABLE III.4

              NUMBER OF SUBSAMPLES VERSUS SAMPLE SIZE

           Sample     Number of     Number of          Number of
           Size       Subsamples    Inversion Points   Subsamples
  Model    (n)        (k)           (N)                (K)

    1        20          5             20                  5
             50         10             50                 10
            100         15            100                 15
            175         30            100                 15
            250         45            100                 15

    2        20         10             20                 10
             50         25             50                 25
            100         25            100                 25
            175         35            100                 25
            250         50            100                 25

    3        20         10             20                 10
             50         25             50                 25
            100         33            100                 33
            175         80            100                 33
            250        125            100                 33

These selections were denoted nearly optimal for two reasons.
First, only a very few cases had N, the number of inversion
points, greater than 100 as the optimal choice. The difference in
the CVM criteria for the optimal choice and the value listed in
Table III.4 was insignificant. For example, for sample size 50
using Model 3, the range of values for the modified CVM integral
was [.00088, .00190] for the distribution function and
[.00189, .00760] for the density function. The actual values
chosen correspond to .00088 and .00190 for the distribution and
density functions respectively. Thus, the decrease in the criteria
did not justify the added computational effort to invert more than
100 points. The number of points in each pseudosample, N, was
defined using the following algorithm:

\[
N =
\begin{cases}
20, & n < 20 \\
n, & 20 \le n \le 100 \\
100, & n > 100
\end{cases}
\]

The number of subsamples for the pseudosample, K, was defined to
be the corresponding k for n=N. Second, due to the high
variability of such a small Monte Carlo sample size, we again
opted for reasonable values which followed a generally regular
trend.
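The rule for N and K can be written as a short Python sketch. The
function and dictionary names are hypothetical; the k values are
those listed in Table III.4, so the lookup below only covers
sample sizes for which N falls on a tabulated value (20, 50, or
100). Other sample sizes would require the step functions of
Figures 3.10 through 3.12, which are not reproduced here.

```python
def n_inversion_points(n):
    """Number of points N in each pseudosample for sample size n."""
    if n < 20:
        return 20
    if n <= 100:
        return n
    return 100


# Number of subsamples k by model and sample size, from Table III.4.
SUBSAMPLES = {
    1: {20: 5, 50: 10, 100: 15, 175: 30, 250: 45},
    2: {20: 10, 50: 25, 100: 25, 175: 35, 250: 50},
    3: {20: 10, 50: 25, 100: 33, 175: 80, 250: 125},
}


def pseudosample_settings(model, n):
    """Return (N, K): K is the tabulated k at sample size N."""
    big_n = n_inversion_points(n)
    return big_n, SUBSAMPLES[model][big_n]
```

For example, `pseudosample_settings(1, 175)` returns `(100, 15)`,
matching the N and K columns of Table III.4 for Model 1 at n = 175.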

The number of subsamples for sample sizes not listed in
Table III.4 was arbitrarily determined by constructing step
functions for each model such that the average number of points in
each subsample followed a near linear interpolation through the k
versus n points listed in the table. For sample sizes greater than
250, we use the value of k for n=250. This choice allows the
models to exhibit the uniform convergence property shown earlier
in this chapter since the number of subsamples stays finite.
Figures 3.10, 3.11, and 3.12 show the plots of k versus n for the
three models. Figure 3.13 shows the k-n relationship for Model 2*
developed in conjunction with an adaptive procedure discussed in
the next section. Table III.5 shows the relationship of the
average number of points in each subsample to the sample size for
the three models.

Adaptive  Approaches 

Each of the three models generated in the previous section was
based on stylized and random samples from a specific distribution.
The variables for Models 1, 2, and 3 were chosen by comparison
with the double exponential, normal, and uniform distributions
respectively. While the models are strictly nonparametric and
perform well given a specific underlying distribution, their
performance for an unknown distribution is yet undetermined.

Since the three members of the GEP distribution represent vast
differences in shapes and tail length, and since each
nonparametric model proposed has been associated with a specific
member of the GEP family, it became a natural extension to
consider a nonparametric adaptive model using the three models
already developed.

Figure 3.10. k vs n Plot--Model 1

Figure 3.11. k vs n Plot--Model 2

Figure 3.12. k vs n Plot--Model 3

Figure 3.13. k vs n Plot--Model 2*

                            TABLE III.5

      SELECTED VALUES OF k AND n FOR THE NONPARAMETRIC MODELS

  Sample      Model 1        Model 2        Model 3        Model 2*
  Size (n)    k     n/k      k     n/k      k     n/k      k     n/k

      5       2     2.5      2     2.5      2     2.5      2     2.5
     10       3     3.33     5     2.0      5     2.0      2     5.0
     15       3     5.0      7     2.14     7     2.14     3     5.0
     20       5     4.0     10     2.0     10     2.0      4     5.0
     25       5     5.0     12     2.08    12     2.08     5     5.0
     50      10     5.0     25     2.0     25     2.0     10     5.0
     75      15     5.0     25     3.0     33     2.27    15     5.0
    100      15     6.67    25     4.0     33     3.33    20     5.0
    150      25     6.0     30     5.0     50     3.0     30     5.0
    200      35     5.71    40     5.0    100     2.0     40     5.0
    250      45     5.56    50     5.0    125     2.0     50     5.0

To develop such a model, we need a discriminant. In the case of
symmetric distributions, three discriminants based on tail length
have been used: kurtosis, Hogg's Q statistic, and percentile
ratios. Applications of the discriminants in parametric estimation
problems can be found in Andrews, et al., Daniels, Harter, et al.,
Hogg, McNeese, and Moore, to name only a few (Refs 5, 17, 34, 38,
55, 60). For our purposes, we do not wish to restrict ourselves to
modeling only symmetric populations. Neither kurtosis nor Hogg's Q
statistic is compatible with the asymmetric case, since both tend
to average the measures of upper and lower tail length. However,
it is possible to use percentile ratios as a discriminant for each
tail individually. Thus, we can, heuristically at least, envision
a model which could adequately portray a leptokurtic tail on one
end and a platykurtic tail on the other.


Percentile Ratios. Let F be a continuous distribution function.
Now define the lower and upper percentile ratios, PL and PU, as
follows:

\[
PL = \frac{F^{-1}(.5) - F^{-1}(.025)}{F^{-1}(.5) - F^{-1}(.25)},
\qquad
PU = \frac{F^{-1}(.975) - F^{-1}(.5)}{F^{-1}(.75) - F^{-1}(.5)}.
\]

By construction PL and PU are greater than or equal to unity.
Table III.6 lists the lower and upper percentile ratios for some
common distributions.
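The population percentile ratios can be checked numerically with a
few lines of Python. In this sketch the function names are
hypothetical, the quantile functions are the standard closed forms
for each distribution, and PU is computed with the denominator
F^{-1}(.75) - F^{-1}(.5), the form that reproduces the asymmetric
entries of Table III.6 (for example, the exponential
distribution).

```python
import math
from statistics import NormalDist


def percentile_ratios(quantile):
    """Return (PL, PU) for a distribution given its inverse distribution function."""
    median = quantile(0.5)
    pl = (median - quantile(0.025)) / (median - quantile(0.25))
    pu = (quantile(0.975) - median) / (quantile(0.75) - median)
    return pl, pu


def exponential_quantile(p):
    """Unit-scale exponential distribution."""
    return -math.log(1.0 - p)


def cauchy_quantile(p):
    """Standard Cauchy distribution."""
    return math.tan(math.pi * (p - 0.5))


ratios_uniform = percentile_ratios(lambda p: p)
ratios_normal = percentile_ratios(NormalDist().inv_cdf)
ratios_exponential = percentile_ratios(exponential_quantile)
ratios_cauchy = percentile_ratios(cauchy_quantile)
```

The uniform, exponential, and Cauchy values agree with Table III.6
to three decimals; the normal value computed this way is close to,
though not identical with, the tabulated 2.904.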

                            TABLE III.6

                   POPULATION PERCENTILE RATIOS

                                  Percentile Ratios
  Distribution                    Lower        Upper

  Normal                           2.904        2.904
  Uniform                          1.900        1.900
  Double Exponential               4.322        4.322
  Triangular                       2.651        2.651
  Cauchy                          12.706       12.706
  Exponential                      1.647        4.322
  Weibull (2)                      2.274        3.155
  Weibull (3)                      2.630        2.870
  Beta (1, 2)                      1.764        2.651
  Beta (1/2, 1/2)                  1.409        1.409
  Largest Extreme Value            2.410        3.764

Shape parameters are given in parentheses. Triangular distribution
has support [-2,2]. Beta distribution has support [0,1]. All other
distributions have been standardized with location parameter zero
and scale parameter one.

The next step was to examine the distributions of the percentile
ratios themselves. We approximated these distributions by our
nonparametric models. Monte Carlo samples of size 20, 50, 100,
175, 250, and 500 were drawn from each of the three GEP test
distributions. The lower percentile ratio was then calculated. The
process was repeated 100 times to get 100 values of PL for each
sample size and test distribution. This is equivalent to 100
values of PU since the random samples were drawn from symmetric
populations. We then used our nonparametric models to generate
approximate distribution functions for PL (or PU) at each test
distribution and sample size. Model 1 was used for the
distribution of the percentile ratios computed from uniform and
double exponential random samples. Model 2 was used for the
distribution computed from normal random samples. Selection of
these models was based on both graphical characteristics and the
sample percentile ratios.

At this point we imposed two constraints. First, since Model 3
tended to perform poorly if the true distribution was not uniform,
we shall only use Model 3 when the sample strongly suggests a
shape resembling the uniform. Let SPR be the sample percentile
ratio, either lower or upper, and let $PR_1$ and $PR_2$ be the
values of the percentile ratio where the adaptive procedure
switches models. We set
$P(SPR < PR_1 \mid \text{uniform distribution}) = .5$. Second,
since both Models 1 and 2 perform reasonably well for both the
double exponential and the normal distributions, set
$P(SPR < PR_2 \mid \text{double exponential distribution}) =
P(SPR > PR_2 \mid \text{normal distribution})$. Thus, we equate
the probabilities of an incorrect choice. Based on these two
constraints and our nonparametric distribution functions, we
solved for $PR_1$ and $PR_2$ across all sample sizes considered.
Values derived were $PR_1 = 1.9$ and $PR_2 = 3.5$. Table III.7
lists the approximate probabilities for the sample percentile
ratio falling in any of the three intervals defined by $PR_1$ and
$PR_2$ for the three underlying distributions and various sample
sizes.

                            TABLE III.7

        SELECTED PROBABILITIES--LOWER PERCENTILE RATIO (PL)

                       UNIFORM DISTRIBUTION
  Sample
  Size      P(PL <= 1.9)   P(1.9 < PL <= 3.5)   P(PL > 3.5)

    20         .4326            .5025              .0649
    50         .5178            .4738              .0084
   100         .5541            .4428              .0031
   175         .5085            .4915              0
   250         .5544            .4456              0
   500         .4881            .5119              0

                       NORMAL DISTRIBUTION
  Sample
  Size      P(PL <= 1.9)   P(1.9 < PL <= 3.5)   P(PL > 3.5)

    20         .0994            .5711              .3295
    50         .0354            .7273              .2373
   100         .0350            .7992              .1658
   175         .0080            .8753              .1167
   250         .0068            .9295              .0637
   500         0                .9658              .0342

                  DOUBLE EXPONENTIAL DISTRIBUTION
  Sample
  Size      P(PL <= 1.9)   P(1.9 < PL <= 3.5)   P(PL > 3.5)

    20         .0592            .2715              .6693
    50         .0231            .1851              .7918
   100         .0026            .1594              .8380
   175         .0012            .1222              .8766
   250         .0013            .0972              .9015
   500         0                .0375              .9625

The construction of our nonparametric estimators allows the use of
only one model for each sample considered. Having two different
percentile ratios creates an ambiguity as to which model to
finally choose. We resolved this dichotomy in two ways. First,
Model 1 seemed to perform better when the underlying population
was normal than Model 2 performed if the underlying population was
double exponential. So, we chose Model 1 if both Models 1 and 2
are indicated. Actually, it turns out that the model number is its
relative order of precedence. Second, we discovered that the
uniform distribution could also be approximated well by using
either Model 1 or Model 2 and forcing the extrapolated points for
each subsample to be constants. These points are based on
extrapolation from the entire sample.

From the previous three models and the fixed extrapolation point
modification, Models 4 and 5 were developed. Model 4 uses the
first three models depending on the values of the sample
percentile ratios. Model 5 uses only Models 1 and 3.

In  analyzing  the  relationship  of  k,  the  number  of 
subsamples,  and  n,  the  sample  size,  it  was  evident  from  a 
graphical  standpoint  that  the  ratio  of  k/n  determined  how 
much  detail  the  approximation  possessed.  So  a  choice  of 
a  nominal  ratio  of  k/n  seemed  appealing.  Since  Models  1 
and  2  performed  reasonably  well  for  double  exponential  and 
normal  random  samples,  we  postulated  another  model  which 
is  a  compromise  between  the  two  in  the  sense  of  the  k/n 
ratio.  We  chose  the  simple  expression: 


$$
k = \begin{cases} \dfrac{n+4}{5}, & n \le 250 \\[6pt] 50, & n > 250 \end{cases}
$$

Thus, for samples of size 250 or less, each subsample
contains either 4 or 5 data points. As in Model 2, we kept
the number of inversions at one. Denote this new model
as Model 2* since, with the exception of the new choice
of k, it uses the same variables as Model 2. An adaptive
procedure, Model 6, was based on Models 2* and 3. A summary
of all three adaptive models is given in Table III.8.
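The choice of k for Model 2* can be illustrated with a short sketch. Two assumptions are made that the text does not state: the quotient (n+4)/5 is truncated to its integer part, and the n observations are divided as evenly as possible among the k subsamples. The function names are hypothetical.

```python
def subsample_count(n):
    """Number of subsamples k for Model 2* (integer part assumed)."""
    if n <= 0:
        raise ValueError("sample size must be positive")
    return (n + 4) // 5 if n <= 250 else 50


def subsample_sizes(n):
    """Sizes of the k subsamples under an assumed near-equal split."""
    k = subsample_count(n)
    base, extra = divmod(n, k)
    # 'extra' subsamples receive one additional observation.
    return [base + 1] * extra + [base] * (k - extra)


for n in (20, 50, 100, 175, 250, 500):
    sizes = subsample_sizes(n)
    print(n, subsample_count(n), min(sizes), max(sizes))
```

Under these assumptions every sample size from 16 through 250 yields subsamples of 4 or 5 points, which agrees with the statement in the text; for n = 500 the 50 subsamples each hold 10 points.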


Summary 

This chapter has traced the derivation of a nonparametric,
continuous, differentiable sample distribution function.
First, we considered a simple scheme to extend plotting
positions to a continuous, differentiable function. Then,
we improved on our distribution and density estimators by
the use of averaging functions based on subsamples, similar
to the jackknife. Next, we investigated the properties of
uniform convergence and of distribution functions as they
apply to our new estimators. Theorem 3.6 concludes the
uniform convergence arguments. A smoothing routine, which
again preserves the distribution function properties, was
introduced. Next, a detailed analysis of stylized and random
samples from representative members of the Generalized
Exponential Power distribution resulted in selection of
three initial nonparametric models. With the addition of
the percentile ratios as discriminants of tail length, three
adaptive models were then defined. Having completed the
theoretical development of our six chosen models, we turn
next to an evaluation and comparison of these techniques
as estimators.




TABLE III.8

DECISION RULES FOR ADAPTIVE MODELS

Model 4

    Percentile Ratios
    Lower          Upper          Model

    [1.0, 1.9)     [1.0, 1.9)     Model 3
    [1.0, 1.9)     [1.9, 3.5]     Model 2, fixed X(0)
    [1.0, 1.9)     (3.5, ∞)       Model 1, fixed X(0)
    [1.9, 3.5]     [1.0, 1.9)     Model 2, fixed X(n+1)
    [1.9, 3.5]     [1.9, 3.5]     Model 2
    [1.9, 3.5]     (3.5, ∞)       Model 1
    (3.5, ∞)       [1.0, 1.9)     Model 1, fixed X(n+1)
    (3.5, ∞)       [1.9, 3.5]     Model 1
    (3.5, ∞)       (3.5, ∞)       Model 1

Model 5

    Percentile Ratios
    Lower          Upper          Model

    [1.0, 1.9)     [1.0, 1.9)     Model 3
    [1.0, 1.9)     [1.9, ∞)       Model 1, fixed X(0)
    [1.9, ∞)       [1.0, 1.9)     Model 1, fixed X(n+1)
    [1.9, ∞)       [1.9, ∞)       Model 1

Model 6

    Percentile Ratios
    Lower          Upper          Model

    [1.0, 1.9)     [1.0, 1.9)     Model 3
    [1.0, 1.9)     [1.9, ∞)       Model 2*, fixed X(0)
    [1.9, ∞)       [1.0, 1.9)     Model 2*, fixed X(n+1)
    [1.9, ∞)       [1.9, ∞)       Model 2*

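The decision rules for the adaptive models can be written compactly as a lookup on the two sample percentile ratios. The sketch below is an illustrative reading of Table III.8 under stated assumptions, not the author's code: a ratio below 1.9 is treated as a short tail, a ratio in [1.9, 3.5] as a medium tail, and a ratio above 3.5 as a long tail, and the function names and returned labels are hypothetical. The label "fixed lower point" stands for holding the lower extrapolated point constant, and "fixed upper point" for holding X(n+1) constant.

```python
SHORT, MEDIUM, LONG = "short", "medium", "long"


def tail_class(ratio, pr1=1.9, pr2=3.5):
    """Classify a percentile ratio by the switching values PR1 and PR2."""
    if ratio < pr1:
        return SHORT
    if ratio <= pr2:
        return MEDIUM
    return LONG


def _with_fixed_points(base, lower_short, upper_short):
    """Attach the fixed-extrapolation-point modification to a base model."""
    if lower_short and upper_short:
        return "Model 3"
    if lower_short:
        return base + ", fixed lower point"
    if upper_short:
        return base + ", fixed upper point"
    return base


def model_4(lower, upper):
    """Adaptive Model 4: chooses among Models 1, 2 and 3."""
    lo, up = tail_class(lower), tail_class(upper)
    # Model 1 takes precedence over Model 2 when either tail is long.
    base = "Model 1" if LONG in (lo, up) else "Model 2"
    return _with_fixed_points(base, lo == SHORT, up == SHORT)


def model_5(lower, upper, pr1=1.9):
    """Adaptive Model 5: uses only Models 1 and 3."""
    return _with_fixed_points("Model 1", lower < pr1, upper < pr1)


def model_6(lower, upper, pr1=1.9):
    """Adaptive Model 6: uses only Models 2* and 3."""
    return _with_fixed_points("Model 2*", lower < pr1, upper < pr1)


print(model_4(1.647, 4.322))  # ratios tabulated for the exponential
print(model_6(2.904, 2.904))  # ratios tabulated for the normal
```

Writing the rule this way makes the precedence explicit: the fixed-point modification depends only on which tail is short, while the base model depends on whether any tail is long.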

IV.  Distribution  and  Density  Function  Estimation 


Introduction 

Having constructed six nonparametric models, we
now propose to evaluate their performance and demonstrate
their feasibility. We begin by surveying several other
authors' estimates of the distribution function, both
continuous estimates and step functions. Estimates of the
density function are then examined. These include kernel
estimates, orthogonal series estimates, delta sequences and
a more recent entropy-based estimate. The new nonparametric
estimators are then compared on the basis of mean integrated
square error of both density and distribution functions.
Tables are given which list the results of Monte Carlo
comparisons of the models over six distributions and six
sample sizes. The results are compared with two other
continuous density approximations. Convergence rates for
the estimators are also approximated. Next, some specific
examples of the models are shown plotted for five different
distributions. Finally, the hazard function is estimated
and plotted. As a tool, the hazard function, coupled
with the density and distribution functions, forms a
powerful discriminant of density types.


Historical  Survey 

Distribution Function Estimation. We have already
examined some estimates of distribution functions in our
discussion of sample distribution functions in Chapter II.
Some were rather general, like Vogt's variant of the
empirical distribution function, while others, like
Schuster's, were concerned with reflecting points about
the estimated location parameter of a symmetric distribution.
The references in Chapter II describe rather simple
step function approaches to estimating the distribution
function.

Several other methods also merit discussion. While
his estimate is still a step function, Turnbull developed
an algorithm to calculate the maximum likelihood estimate
$\hat{F}$ of an underlying distribution function $F$. He shows
monotonic convergence of his algorithm to $\hat{F}$ and indicates
an application to hypothesis testing, while considering
data sets which are arbitrarily grouped, censored or
truncated (Ref 97). For an average squared error loss
function, Phadia showed that the step function estimator

$$
\hat{F}(t) = \frac{1}{2(m+1)} + \frac{1}{m(m+1)} \sum_{i=1}^{n} \delta_{X_i}(-\infty, t)
$$

is minimax, where $m = \sqrt{n}$ and $\delta_{X_i}$ is a measure on the
real line which assigns a unit mass to $X_i$. He further
derived step function estimators which are best invariant
and also best invariant confidence bands (Ref 67).
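The step function estimator attributed to Phadia can be read as the empirical distribution function shrunk toward 1/2. The sketch below is an illustration under two assumptions that are not confirmed by the text: the point masses are summed over the whole sample, and an observation equal to t is counted as lying at or below t. The function name `minimax_cdf` is hypothetical.

```python
import math


def minimax_cdf(sample, t):
    """Shrunken step-function estimate of F(t) with m = sqrt(n)."""
    n = len(sample)
    if n == 0:
        raise ValueError("sample must not be empty")
    m = math.sqrt(n)
    # Number of observations at or below t (unit mass at each X_i).
    count = sum(1 for x in sample if x <= t)
    return 1.0 / (2.0 * (m + 1.0)) + count / (m * (m + 1.0))


data = [1.0, 2.0, 3.0, 4.0]
for t in (0.0, 2.0, 10.0):
    print(t, minimax_cdf(data, t))
```

With n = 4 the estimate runs from 1/6 below the smallest observation to 5/6 above the largest, so unlike the empirical distribution function it never reaches 0 or 1.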

Continuous functions have also been developed.
Smaga derives a smooth empirical distribution function in
a manner similar to kernel estimates for a probability
density (Ref 86). Orthogonal series estimators, based on
trigonometric functions proposed by Kronmal and Tarter,
give a continuous approximation for the distribution
function. Their Fourier series method produced impressive
mean integrated square error values. A significant drawback
of the method is that these estimators lack the properties
of distribution functions (Refs 40, 48).

While we are primarily concerned with nonparametric
estimation, some rather general three- or four-parameter
families of distributions can be used to approximate
a distribution function. Recently, one such four-parameter
family was introduced by Ramberg, et al. Based
on a generalization of Tukey's lambda function, this new
distribution approximates a wide range of both symmetric
and asymmetric populations (Ref 72).

In addition to the estimating methods presented
both in this chapter and in Chapter II, the approaches to
density estimation given in the next section provide the
opportunity for further distribution function estimation.
As we have seen, some authors attack the general problem
of data modeling by investigating the distribution function.
We now consider those who chose a path of density function
estimation.

Density Function Estimation. Oldest among the
density function estimates is the histogram. Given a set
of class intervals, the histogram is a maximum likelihood
estimator. This dependence on interval selection, however,
is a serious drawback. While the method of maximum likelihood
has been a classical technique, recently the minimum
distance method developed by Wolfowitz has inspired numerous
articles, particularly in the sense of parametric
estimation (Ref 108). Reiss proposes minimum distance
estimators of unimodal densities. He proves consistency
and gives a computational algorithm. Using the empirical
distribution function and the Kolmogorov-Smirnov distance
measures, Reiss' estimators are defined as constants
between ordered sample data points. As such, the estimators
are actually minimum distance histograms (Ref 74).

Since 1956, some significant continuous approximations
have emerged. Much of the literature has been
devoted to kernel estimators, first developed by Rosenblatt
(Ref 75). Most of the important results are summarized in
a recent book by Tapia and Thompson (Ref 94). Wegman and
Davies discuss two recursive estimators closely related to
kernel estimators. They also propose a sequential estimation
procedure based on the recursive estimators (Ref 106).
Singh evaluates the mean square errors of a density estimator
of the kernel type and its derivatives (Ref 85).
Some further properties of kernel estimators are proposed
by Schuster (Ref 81). A Fourier inversion method of density
estimation is proposed by Blum and Susarla. They show that
this estimator possesses mean square consistency and
asymptotic normality (Ref 8).

Various estimation techniques based on orthogonal
series expansions have also been developed. Kronmal and
Tarter proposed estimators of both distribution and density
functions using Fourier series. Expressions for the mean
integrated square error are developed in terms of the
variances of the Fourier coefficients. Both Schwartz and
Walter evaluate the properties of a density estimator based
on Hermite functions, which are defined in terms of Hermite
polynomials (Refs 84, 100). Watson proposes another
orthogonal series estimator (Ref 102). Crain uses the set of
normalized Legendre polynomials on [-1,1] as his orthogonal
set. He incorporates both a restricted maximum likelihood
approach and the information-theoretic distance defined by
Kullback (Ref 14).

Watson and Ledbetter defined a density estimator
as an average of square integrable functions. Expressions
for these functions are derived based on a mean integrated
square error criterion (Ref 103). Walter and Blum generalized
many of the previously mentioned methods into one
method based on "delta sequences," sequences of functions
which converge to a generalized function $\delta$. This delta
sequence method includes kernel estimators, orthogonal
series estimators, Fourier transform estimators and
histograms (Ref 101). Convergence rates are also generalized
from the results of Wahba (Ref 99).

Parzen has attempted to incorporate both parametric
and nonparametric schemes in an approach to data
modeling. He also introduces density quantile functions
and a method of autoregressive density estimation (Ref 65).

Entropy approaches have also been suggested to
estimate probability densities. MacQueen and Marschak
discuss the rationale for using a maximum entropy approach
to estimate Bayesian prior distributions (Ref 52). Miller,
using the maximum entropy formalism given by Tribus
(Ref 95), approximates a density function as a member of
the exponential family of distributions, F. Miller's
approximations are shown to be within computational accuracy
when the underlying distribution is a member of F and
accurate average values of the "information functions" are
available (Ref 57).


Estimator  Comparisons 

Having examined previous distribution and density
function estimators, we now wish to evaluate the new
nonparametric estimators proposed in Chapter III. We begin
by examining the criteria for comparison. Next we discuss
the mechanics of the Monte Carlo study. Finally, we shall
present the results and conclusions of the comparisons.


Criteria. To derive the various variables which
make up our models, we previously used a modified
Cramér-von Mises (CVM) integral criterion. Here we will use
this same criterion to evaluate the estimators. As mentioned
in Appendix 1, this modified CVM integral approximates the
average square error and mean integrated square error
(MISE) with weight function f.

If we restrict ourselves to the family of continuous
distribution functions, F, which can be parameterized
by location and scale parameters, we can show by
construction that $SF(x)$ belongs to F. Further, with respect
to the distribution functions as the arguments, the modified
KS integral, modified CVM integral and modified
Anderson-Darling (AD) integral are all location and scale
invariant. When the density functions are used in the
arguments of these integrals, location invariance is
preserved, but scale invariance is not. For example, let X
be a random variable from a standard normal distribution.
Now let $Y = X/a$. Choose a random sample $\{X_i\}$, $i = 1, \ldots, n$,
and form $\{Y_i\}$, $i = 1, \ldots, n$. Now let $SF_X(x)$ and $sf_X(x)$ be
the nonparametric approximations based on the sample $\{X_i\}$,
$i = 1, \ldots, n$, and similarly for Y. Then

$$
\int \left( f_Y(y) - sf_Y(y) \right)^2 \, dSF_Y(y) = a^2 \int \left( f_X(x) - sf_X(x) \right)^2 \, dSF_X(x).
$$

Given the modified CVM integral value for a standardized
distribution, we can compute the integral for another
random variable with a different scale factor but the same
distribution type.
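The factor $a^2$ in the scale relation can be traced with a change of variables. The following is a sketch of one way to obtain it, assuming $a > 0$ and assuming that the nonparametric estimators transform under rescaling in the same way as the true functions, that is, $SF_Y(y) = SF_X(ay)$ and $sf_Y(y) = a \, sf_X(ay)$; the second assumption is suggested by, but not proved in, the statement that $SF(x)$ belongs to the location and scale family.

```latex
% With Y = X/a and a > 0, densities pick up a factor a:
%   f_Y(y) = a f_X(ay),  sf_Y(y) = a sf_X(ay),  SF_Y(y) = SF_X(ay).
\begin{align*}
\int \bigl( f_Y(y) - sf_Y(y) \bigr)^2 \, dSF_Y(y)
  &= \int \bigl( a f_X(ay) - a \, sf_X(ay) \bigr)^2 \, dSF_X(ay) \\
  &= a^2 \int \bigl( f_X(ay) - sf_X(ay) \bigr)^2 \, dSF_X(ay) \\
  &= a^2 \int \bigl( f_X(x) - sf_X(x) \bigr)^2 \, dSF_X(x),
  \qquad x = ay .
\end{align*}
```

The same substitution applied to the distribution functions leaves the integrand unchanged, which is why the distribution function criteria are scale invariant while the density criteria are not.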

Monte Carlo Mechanics. With our criteria defined,
we generated random samples via the methods discussed
in Appendix 3. Twenty-five samples of each of the sizes 20,
50, 100, 175, 250 and 500 were drawn from each underlying
distribution. These distributions included the double
exponential, normal, uniform, triangular, Cauchy, and
exponential. To keep a consistent comparison with other
published results, the uniform and triangular distributions
were defined on [0,1]. All other distribution functions had
a zero location parameter and unit scale parameter. Each
random sample was compared with nonparametric Models 1
through 6. Values for the MISE of both the distribution
function and the density function were approximated by
averaging the twenty-five modified CVM integrals. A standard
error of each estimate was also calculated. As a numerical
check, the average square errors were also calculated and
were in close agreement with the modified CVM criterion.
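The averaging step of such a Monte Carlo study can be sketched for the one estimator whose criterion has a closed form, the empirical distribution function (EDF). For the EDF, the integral of $(F_n - F)^2 \, dF$ equals the Cramér-von Mises statistic $W^2$ divided by n, and its expected value is exactly $1/(6n)$. The code below is an illustration only: it does not implement the new models or the modified CVM integral of Appendix 1, the function names are hypothetical, and uniform samples are used because the criterion depends on the data only through $F(X_i)$.

```python
import random
import statistics


def cvm_integral_edf(u_values):
    """Integral of (F_n - F)^2 dF for the EDF, given the values F(X_i)."""
    n = len(u_values)
    u = sorted(u_values)
    # Computing formula for the Cramer-von Mises statistic W^2.
    w2 = 1.0 / (12 * n) + sum(
        (u[i] - (2 * i + 1) / (2 * n)) ** 2 for i in range(n)
    )
    return w2 / n


def monte_carlo_mise(n, reps=25, seed=0):
    """Average the criterion over reps samples; return (mean, std error)."""
    rng = random.Random(seed)
    values = [
        cvm_integral_edf([rng.random() for _ in range(n)])
        for _ in range(reps)
    ]
    mean = statistics.fmean(values)
    std_err = statistics.stdev(values) / reps ** 0.5
    return mean, std_err


for n in (20, 50, 100, 175, 250, 500):
    mean, std_err = monte_carlo_mise(n)
    print(f"n={n:3d} mean={mean:.5f} se={std_err:.5f} exact={1 / (6 * n):.5f}")
```

With only twenty-five replications the standard error is a sizable fraction of the mean, which is why reporting it beside each tabulated value is useful for judging Monte Carlo accuracy.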

Results. Tables IV.1 through IV.8 summarize the
main results of the Monte Carlo study. Although a small
Monte Carlo sample size was used, relative comparisons
among the nonparametric models developed here can be made.
The same random samples were used to calculate the modified
CVM integrals for each model. Tables which give
approximate MISE also include the standard error of the
estimate beneath each entry to give a measure of the Monte
Carlo accuracy.

Table IV.1 shows a comparison among all six models
using the approximate MISE of the distribution function
for sample size 100. The last column lists the mean of
the asymptotic distribution of the Cramér-von Mises
statistic, $W^2$, normalized by the sample size (Ref 4). This
value is the MISE of the distribution function when the
empirical distribution function is used as the estimator.
Note that in all cases except for the Cauchy distribution,
Models 1, 2 and the three adaptive models outperform the
empirical distribution function in terms of MISE. Given
an underlying uniform distribution, Model 3 is the clear
choice. However, its poor performance for other distributions
results from the fixed plotting positions based on
the entire sample. The excellent performance of the
adaptive models for the distributions considered is
especially encouraging. These results indicate that, on
the average, our nonparametric models are closer to the
true distribution function than the empirical distribution
function under the criterion of mean integrated square
error.

For the density functions, a direct comparison of
our models with the estimators evaluated by Wegman was
made. We chose to repeat only the two continuous density
estimators tested, the naive estimator based on a uniform
kernel and the trigonometric estimator of Kronmal and
Tarter. For average square error values of histogram
estimators, refer to Wegman (Ref 105). Table IV.2 gives
the approximate MISE values for the density estimators.
Note the competitive performance of our models of the
density functions. No one estimator is clearly superior.
Again the performance of the adaptive models is encouraging.

Remember that the motivation for the development
of this new nonparametric family of estimators was based
on modeling the distribution functions. The density
estimators are merely analytic derivatives of these
distribution functions. Since differentiation is an unbounded
linear operator, one would suspect a large discrepancy
between a differentiated estimate and one specifically
designed to model the density function itself. The
comparable performance of these new models against pure
density estimators demonstrates their versatility.

It should also be noted that the trigonometric
estimator introduced negative density values in samples
from the normal, Cauchy and exponential distributions.
Although the trigonometric density estimates do integrate
to unity over their finite support, usually the interval
$[X_{(1)}, X_{(n)}]$, their utility is diminished by the negative
values. Conversely, both the kernel estimator, when the
kernel itself is chosen as a density function, and all of
the new nonparametric models do possess all the properties
of distribution functions.

The addition of the exponential distribution as an
asymmetric example is significant. The performance of the
adaptive models for both the distribution function and
density function indicates that the new nonparametric
approach also performs well over a very general class of
probability distributions.

A further comparison of the density estimators
was made for various sample sizes using the triangular
distribution. Table IV.3 lists the values of the approximate
MISE and the standard errors. The competitive nature
of the new models, particularly the adaptive ones, is
again evident. Tables IV.4 through IV.7 show the performance
of Models 5 and 6 for various sample sizes and
distributions. The MISEs for both the distribution function
and the density function are compared. Tables IV.4
and IV.6 include the mean of the asymptotic distribution
of the normalized CVM statistic as a reference. These
two models are significant in that they will form the
bases for goodness of fit tests proposed in the next
chapter.

Based on the calculated criterion values, we
derived empirical convergence rates for five of the models.
Normalized to criterion values at sample size 50,
Table IV.8 compares the empirical rates to convergence
rates of order $n^{-.5}$, $n^{-.8}$, and $n^{-1}$. The distribution
function models appear to converge at a rate near $n^{-1}$.
This empirical result indicates that the smoothing process
introduced in Chapter III does not appreciably affect the
convergence of the estimators. Recall that the unsmoothed
estimators displayed uniform convergence. Now, we have
empirical evidence of the convergence of our distribution
function models. The density function estimates appear
to converge at a rate between $n^{-.5}$ and $n^{-.8}$. This rate
is not as rapid as the theoretical convergence rate of the
kernel estimate given by Rosenblatt or the approximate
convergence rate for the trigonometric estimate given by
Wegman (Refs 75 and 105). However, we have demonstrated
empirical convergence of our density estimators, a property
not analytically verifiable due to the differentiation
operation. While the convergence rates appear somewhat
slower, the previous tables show that the actual criterion
values of our model estimators are very close to those of
the methods currently available. Further, the use of
nonparametric estimates for very large samples is a
questionable procedure. Large samples are ideally suited to
a parametric approach, since the amount of information
available should provide model discrimination. Thus, all of
the results of this analysis support the use of the new
nonparametric models for small and intermediate sample sizes.
The results of investigations of samples of size 20 indicate
that the strength of these models may lie in small
sample analysis.
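The comparison of empirical rates with reference rates can be sketched as follows. The code normalizes criterion values to the value at sample size 50, builds reference curves of the form $(n/50)^{-r}$, and fits a log-log slope as a single-number summary. The input used here is the exact MISE of the empirical distribution function, $1/(6n)$, chosen only because it is known in closed form; it is not data from Table IV.8, and the function names are hypothetical.

```python
import math

SIZES = (20, 50, 100, 175, 250, 500)


def normalize(values, base=50):
    """Divide each criterion value by the value at the base sample size."""
    ref = values[base]
    return {n: v / ref for n, v in sorted(values.items())}


def reference_curve(sizes, exponent, base=50):
    """Normalized values implied by a convergence rate n ** (-exponent)."""
    return {n: (n / base) ** (-exponent) for n in sizes}


def loglog_slope(values):
    """Least squares slope of log(criterion) against log(n)."""
    xs = [math.log(n) for n in values]
    ys = [math.log(v) for v in values.values()]
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    num = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    den = sum((x - x_bar) ** 2 for x in xs)
    return num / den


# Exact MISE of the empirical distribution function, used as a stand-in.
edf_mise = {n: 1.0 / (6.0 * n) for n in SIZES}

print(normalize(edf_mise))
for r in (0.5, 0.8, 1.0):
    print(r, reference_curve(SIZES, r))
print("fitted exponent:", -loglog_slope(edf_mise))
```

An estimator whose normalized values track the curve for r = 1 is converging at the same rate as the empirical distribution function; values falling between the curves for r = .5 and r = .8 correspond to the slower density rates described above.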

Graphical  Comparisons 

Much of the impetus for this research resulted
from the ability to analyze many different random samples
graphically. For criteria such as MISE, the accuracy of
the approximations becomes obscured when dealing with such
small quantities, at least for this author. MISE is also
an average error, so a graphical approach may give more
insight as to the influence that various portions of the
density have on the mean value. For example, a graphical
analysis showed that while the MISE of the density function
for the exponential distribution using Model 3 was far
superior, the poor estimation of tail values resulted in
an extremely poor distribution function MISE. This
observation calls into question the widely accepted use of
MISE as a density function estimation criterion. Relying
solely on MISE for the density function allows very poor
estimators to appear quite good. Throughout this study, we
have contended that density estimators should be compared
with respect to criteria evaluated at their corresponding
distribution functions as well as at the density function.
A graphical examination is a simple way to expose these
ill-conceived estimators.

To demonstrate the versatility of the new nonparametric
estimators, we chose random samples of size 100
from the double exponential, uniform, triangular, Cauchy,
and exponential distributions. The nonparametric model
used in each case is the one with the smallest approximate
MISE listed in Table IV.1. Figures 4.1 through 4.10
present the distribution function and density function
approximations plotted against the true underlying
processes. Table IV.9 lists the values of the approximate
MISEs for the distribution and density functions for each
random sample. Many other samples and distribution functions
have been examined for different sample sizes. Other
probability distributions analyzed included various beta
distributions, including U shapes, Weibull distributions,
gamma distributions, and extreme value distributions.

Figure 4.1. Double Exponential CDF vs Model
Figure 4.2. Double Exponential PDF vs Model
Figure 4.3. Uniform CDF vs Model
Figure 4.4. Uniform PDF vs Model
Figure 4.5. Triangular CDF vs Model
Figure 4.6. Triangular PDF vs Model
Figure 4.9. Exponential CDF vs Model
Figure 4.10. Exponential PDF vs Model


TABLE IV.9

APPROXIMATE MISE: RANDOM SAMPLES, SAMPLE SIZE 100

                              MISE
Distribution           Distribution     Density
                       Function         Function

Double Exponential     .00044           .00352
Uniform                .00054           .00125 (.01500) [1]
Triangular             .00170           .00150 (.02403) [1]
Cauchy                 .00331           .00058
Exponential            .00031           .00786

Note 1: Density function MISE normalized to the
interval [0,1].


Hazard Function Estimation

The availability of a continuous density function
estimator derived from a continuous, differentiable
distribution function estimator automatically allows one to
calculate a continuous hazard function estimator. The
hazard function, defined by $h(x) = f(x)/(1 - F(x))$, can be a
powerful density function discriminant and is used
extensively in reliability engineering and life testing.
Early research in hazard analysis was done by Watson and
Ledbetter, which prompted their later investigation of
density estimation (Ref 103). An empirical approach to
hazard function estimation can take the form of estimating
the hazard function at the sample data points and fitting
some least squares curve through the calculated points
(Ref 44). Because a differencing scheme must be used to
construct the density function estimate, the calculated
hazard point estimates have magnified errors. The
use of a continuous density approximation has a clear
advantage.

Using the same models as the CDF and PDF plots,
we constructed the hazard function estimates for the random
samples plotted in the last section. Figures 4.11 through
4.15 show the estimators plotted versus the true population
hazard function. The functions are only plotted
between the first and last order statistic. Note the
unique shape of each hazard function and the ability of
the nonparametric estimator to follow the shape.

Armed with only the new nonparametric estimators
and graphs of various distribution, density, and hazard
functions, we now have a powerful tool for identifying
the underlying distribution of the population from which
a random sample is drawn.
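The hazard definition $h(x) = f(x)/(1 - F(x))$ used in this section translates directly into code once a density estimate and a distribution function estimate are available as functions. The sketch below is illustrative only: the new nonparametric estimators are not implemented here, so the true exponential and uniform functions stand in for the estimates, and the name `hazard` is hypothetical.

```python
import math


def hazard(pdf, cdf, x):
    """Hazard h(x) = f(x) / (1 - F(x)) for any pdf/cdf pair of functions."""
    survival = 1.0 - cdf(x)
    if survival <= 0.0:
        # Past the upper end of the support the hazard is unbounded.
        return math.inf
    return pdf(x) / survival


def exp_pdf(x):
    """Density of the exponential with location 0 and scale 1."""
    return math.exp(-x) if x >= 0 else 0.0


def exp_cdf(x):
    """Distribution function of the exponential with scale 1."""
    return 1.0 - math.exp(-x) if x >= 0 else 0.0


def unif_pdf(x):
    """Density of the uniform distribution on [0, 1]."""
    return 1.0 if 0.0 <= x <= 1.0 else 0.0


def unif_cdf(x):
    """Distribution function of the uniform distribution on [0, 1]."""
    return min(max(x, 0.0), 1.0)


for x in (0.1, 0.5, 0.9):
    print(x, hazard(exp_pdf, exp_cdf, x), hazard(unif_pdf, unif_cdf, x))
```

The exponential hazard is constant while the uniform hazard grows as $1/(1 - x)$, which illustrates why the shape of the hazard function can separate distributions whose densities look similar over part of their range. Restricting evaluation to the range between the first and last order statistics, as in the plots, avoids the region where the estimated survival function approaches zero.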

Figure 4.11. Double Exponential Hazard Function vs Model
Figure 4.13. Triangular Hazard Function vs Model
Figure 4.14. Cauchy Hazard Function vs Model
Figure 4.15. Exponential Hazard Function vs Model


Summary

We began our investigation into the utility of
our new nonparametric estimators by surveying the literature
for other distribution and density estimators. A
Monte Carlo study was then described in which the new
models were compared with established estimation schemes.
The new estimators were very competitive in the mean
integrated square error sense. Tables were developed
showing the approximate MISE and standard error of the
estimate. Based on these values, empirical convergence
rates were indicated. We next discussed a graphical
comparison of various random samples from five different
distributions. We concluded with the development of an
approximation to the hazard function, illustrated the
hazard estimator for the five distributions, and argued
for the simultaneous use of distribution, density, and
hazard function graphs in solving problems in model
discrimination.

We  have  demonstrated  that  our  models  are  extremely 
competitive  and  closely  approximate  the  true  distribution 
function  and  density  function.  Their  use  as  a  population 
discriminant  will  be  considered  next  in  the  development 
and  evaluation  of  goodness  of  fit  tests  based  on  the  new 
nonparametric  estimators. 


V.  Goodness  of  Fit  Tests 


Introduction 

Since the last chapter indicated that our models approximated the
true underlying distribution with competitive precision, we will now
use them as a basis for goodness of fit tests. We begin our
discussion with a brief historical survey of goodness of fit tests.
Next we introduce eight new test statistics based on two of the
adaptive models and a sample distribution step function related to
the median ranks. Then, we give the critical values of tests for the
normal and extreme value distributions for both a completely
specified null distribution and a null distribution whose parameters
are estimated. Finally, we present the results of power studies for
both tests. Powers are also compared with some previously published
methods.

Historical  Survey 

Goodness of fit test literature has not suffered from lack of
attention. In our discussion, we are concerned with the goodness of
fit problem in the context of life testing. Two important
distributions used in life testing are the normal and the extreme
value.

Forming the basis for goodness of fit tests is the selection of a
test statistic. An excellent survey of distribution free statistics
is given by Sahler (Ref 78). Consider now some of the tests based on
these statistics for the case of a completely specified null
hypothesis. References in Sahler's survey give much of the
historical background.

To avoid using extensive tables, Stephens proposed computational
approximations for critical values of eleven common test statistics
(Ref 88). Schuster uses a modified empirical distribution function
to develop a test based on the Kolmogorov-Smirnov statistic
(Ref 82). Saniga and Miles evaluate some standard tests of normality
against an alternative distribution which is a member of the
asymmetric stable probability distribution family (Ref 80). Tests of
symmetry have been proposed using the Cramer-von Mises statistic and
modified empirical distribution functions by Rothman and Woodroofe
and Hill and Rao (Refs 36, 76). For the Weibull distribution, or
equivalently the extreme value distribution, Smith and Bain propose
a goodness of fit test based on the correlation coefficient and
evaluate both complete and censored samples in both the completely
specified and composite hypothesis cases (Ref 87). Foutz attempts a
more general approach to goodness of fit testing by using an
empirical probability measure as a basis rather than the empirical
distribution function (Ref 25). A novel approach of Dudewicz and
van der Meulen uses entropy as the basis for a test of uniformity
(Ref 20). Extensions to other distributions have not been published
as yet.

While the aforementioned tests all use a completely specified null
hypothesis, the work of David and Johnson shows that goodness of fit
tests are independent of the true parameter values when invariant
location and scale estimates are substituted and the test depends on
the probability integral transform (Ref 18). This result opened the
door for composite null hypothesis tests which estimate the
parameters of the distribution by invariant estimators. Lilliefors
pioneered investigations of this type by developing tables for the
KS statistic (Ref 50). Stephens conducted tests for uniformity,
normality and exponentiality using modifications of the KS, CVM, AD,
Kuiper and Watson statistics when the parameters were estimated
(Ref 89). Green and Hegazy modify the KS, CVM, and AD tests by using
other sample distribution functions as a basis for the test
statistics. Their results show that improvements in power are
possible when new sample distribution functions are used (Ref 29).
Durbin proposes a generalized KS test when parameters are estimated
and applies the result to tests of exponentiality and spacings
(Ref 21). Durbin's results were based in part on the investigation
of spacings done by Pyke (Ref 69). Pyke's work also motivated Mann,
Scheuer and Fertig's development of two new statistics, L and S.
They proposed tests based on these statistics for the two parameter
Weibull or extreme value distribution (Ref 53). Littell, McClave,
and Offen conducted power studies using the S statistic as well as
four others for these same distributions (Ref 51). Stephens,
following methods developed previously, computed critical values of
modified CVM, AD and Watson statistics for tests of the extreme
value distribution (Ref 90). A recent paper by Mihalko and Moore
shows an application of a chi square goodness of fit test to the two
parameter Weibull when the parameters are estimated (Ref 56).

Test  Procedures 

The classical goodness of fit test can be stated as follows: from an
observed random sample, $X_1, \ldots, X_n$, test whether the sample
comes from a population with distribution function F(x). Standard
tests using EDF or modified EDF statistics are based on comparisons
between F(x) and some sample distribution function. As we have
generated new continuous, differentiable, sample distribution
functions, we follow a similar approach to define our goodness of
fit tests. Because of their outstanding performance using a mean
integrated square error criterion over a wide range of
distributions, we chose Models 5 and 6 to form the bases for our new
tests.

Null Distributions and Situations Considered. One of the major
applications of goodness of fit tests is in the area of life
testing. For this reason, we chose two important and widely used
failure distribution models, the normal and the extreme value
distributions, for our null hypotheses.

The extreme value distribution considered in this entire analysis is
the distribution of the largest value, whose cumulative distribution
function is given by:

$$F(x) = \exp\left[-\exp\left\{-\left(\frac{x-\beta}{\alpha}\right)\right\}\right]$$

where $-\infty < x < \infty$, $-\infty < \beta < \infty$, $\alpha > 0$.
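As an illustration of the distribution just defined, the following minimal Python sketch evaluates the largest-value extreme value CDF and draws variates by inverting it. The function names and the `loc` and `scale` arguments (location and scale parameters) are illustrative assumptions, and inversion is only one possible generation method; the method actually used in this study is described in Appendix 5.

```python
import math
import random


def extreme_value_cdf(x, loc=0.0, scale=1.0):
    """CDF of the largest-value extreme value distribution."""
    if scale <= 0:
        raise ValueError("scale must be positive")
    return math.exp(-math.exp(-(x - loc) / scale))


def extreme_value_quantile(p, loc=0.0, scale=1.0):
    """Inverse CDF: the x that solves F(x) = p for 0 < p < 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return loc - scale * math.log(-math.log(p))


def extreme_value_sample(n, loc=0.0, scale=1.0, seed=0):
    """Draw n variates by inverting the CDF at uniform random numbers."""
    rng = random.Random(seed)
    out = []
    while len(out) < n:
        u = rng.random()
        if u > 0.0:  # random() can return exactly 0.0, which has no quantile
            out.append(extreme_value_quantile(u, loc, scale))
    return out
```

For example, `sorted(extreme_value_sample(20, loc=0.0, scale=1.0))` gives an ordered sample of size 20 from the standard form.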

Two specific hypothesis situations will also be considered. The
first is the classical case of the null distribution, F(x), having
all of its parameters completely specified. The second situation,
and probably the more common one for the applied statistician, is
the case where the functional form of the null distribution is
hypothesized, but the parameters are estimated. Although both the
normal and extreme value distributions are members of a two
parameter family, we chose not to examine the situation where only
one parameter is estimated and the other specified. We believe that
the two situations considered here comprise the vast majority of
cases encountered in actual practice.

The estimators used in the case of the normal distribution will be
the uniformly minimum variance unbiased estimates, $\bar{X}$ and
$S$. For the extreme value we will employ a Newton-Raphson iteration
technique to calculate the maximum likelihood estimators of the
location and scale parameters.
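The Newton-Raphson step for the extreme value case can be sketched as follows. This is a minimal illustration and not the routine used in the study: it assumes the standard reduction of the two likelihood equations to a single equation in the scale parameter, a moment-based starting value, and hypothetical function names.

```python
import math
import random


def simulate_extreme_value(n, loc, scale, seed=0):
    """Illustrative sampler (CDF inversion) used only to exercise the fit."""
    rng = random.Random(seed)
    out = []
    while len(out) < n:
        u = rng.random()
        if u > 0.0:
            out.append(loc - scale * math.log(-math.log(u)))
    return out


def extreme_value_mle(xs, tol=1e-10, max_iter=100):
    """Maximum likelihood (location, scale) for the largest-value
    extreme value distribution, by Newton-Raphson on the scale."""
    n = len(xs)
    xbar = sum(xs) / n
    xmin = min(xs)
    # Starting value from the moments: sd = scale * pi / sqrt(6).
    var = sum((x - xbar) ** 2 for x in xs) / (n - 1)
    b = math.sqrt(6.0 * var) / math.pi
    for _ in range(max_iter):
        # Weights exp(-x/b), shifted by xmin for numerical stability.
        w = [math.exp(-(x - xmin) / b) for x in xs]
        s0 = sum(w)
        m1 = sum(x * wi for x, wi in zip(xs, w)) / s0
        m2 = sum(x * x * wi for x, wi in zip(xs, w)) / s0
        g = b - xbar + m1                     # likelihood equation g(b) = 0
        dg = 1.0 + (m2 - m1 * m1) / (b * b)   # derivative g'(b), always >= 1
        step = g / dg
        while b - step <= 0.0:                # keep the scale positive
            step /= 2.0
        b -= step
        if abs(step) <= tol * b:
            break
    w = [math.exp(-(x - xmin) / b) for x in xs]
    loc = xmin - b * math.log(sum(w) / n)     # location given the scale
    return loc, b
```

With a large simulated sample, for instance `extreme_value_mle(simulate_extreme_value(5000, 2.0, 3.0))`, the estimates should fall close to the generating values.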

Test Statistics. Eight new test statistics are proposed. The first
set of these statistics is based on Models 5 and 6 and the modified
distance measures listed in Appendix 1. Given the random sample,
$X_1, \ldots, X_n$, let SF(x) be based on Model 5. Now define

$$D5 = \max_{1 \le i \le n} \left| F(X_i) - SF(X_i) \right|$$

$$W5 = n \int_{-\infty}^{\infty} \left( SF(x) - F(x) \right)^2 \, dSF(x)$$

$$A5 = n \int_{-\infty}^{\infty} \left( SF(x) - F(x) \right)^2 \left[ SF(x) \left( 1 - SF(x) \right) \right]^{-1} dSF(x)$$

Calculating SF(x) using Model 6 gives similar definitions for D6,
W6, and A6. These first six test statistics are modifications of the
classical KS, CVM and AD statistics.
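The three distance measures can be approximated numerically for any smooth sample distribution function. The sketch below is an illustration only: `sf` stands in for a fitted Model 5 or Model 6 function (neither is defined in this chapter), the Anderson-Darling weight is assumed to be the classical inverse weight 1/[SF(1 - SF)], and the Stieltjes integrals are approximated by a midpoint sum over a finite interval chosen by the user.

```python
def smooth_gof_statistics(sample, sf, f, lo, hi, grid=4000):
    """Return (D, W, A) for a smooth sample distribution function `sf`
    and a hypothesized CDF `f`, integrating over [lo, hi] with respect
    to `sf`. Both `sf` and `f` are callables of one argument."""
    n = len(sample)
    # KS-type distance evaluated at the data points.
    d = max(abs(f(x) - sf(x)) for x in sample)
    h = (hi - lo) / grid
    w_sum = 0.0
    a_sum = 0.0
    for k in range(grid):
        left = lo + k * h
        mid = left + 0.5 * h
        d_sf = sf(left + h) - sf(left)   # Stieltjes increment of sf
        s = sf(mid)
        sq = (s - f(mid)) ** 2
        w_sum += sq * d_sf
        weight = s * (1.0 - s)
        if weight > 0.0:                 # skip points where the weight vanishes
            a_sum += sq / weight * d_sf
    return d, n * w_sum, n * a_sum
```

As a check on the arithmetic, taking `sf(x) = x` and `f(x) = x**2` on [0, 1] gives integrals of 1/30 and 1/6 before multiplication by n, which the midpoint sum reproduces closely.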

Along the lines of the tests proposed by Green and Hegazy, we also
propose two new test statistics based on a sample distribution step
function (Ref 29). We wanted to use the median ranks in both a KS
and a CVM statistic since, as plotting positions, they describe
measures of central tendency for the mostly skewed rank
distributions. The aim was to get the squared term in the summation
for the CVM statistic to contain the difference between the
hypothesized distribution function at that point and the median rank
value. Working backwards, one sample distribution function that will
suffice is $F_n(x)$, where

$$
F_n(x) =
\begin{cases}
.2/(n+.4) & x < X_{(1)} \\
(i+.2)/(n+.4) & X_{(i)} < x < X_{(i+1)}, \quad i = 1, \ldots, n-1 \\
(n+.2)/(n+.4) & x > X_{(n)} \\
(i-.3)/(n+.4) & x = X_{(i)}, \quad i = 1, \ldots, n
\end{cases}
\tag{5.1}
$$


Note that $F_n(X_{(i)})$ is the midpoint of the jump from
$F_n(X_{(i)}^-)$ to $F_n(X_{(i)}^+)$.

We now define two new statistics based on this $F_n(x)$:

$$DMR = \max_{1 \le i \le n} \left| F(X_{(i)}) - \frac{i-.3}{n+.4} \right|$$

and

$$WMR = \frac{n}{12(n+.4)^3} + \frac{n}{n+.4} \sum_{i=1}^{n} \left( F(X_{(i)}) - \frac{i-.3}{n+.4} \right)^2$$

Critical Values. Given the two distributions and two situations for
the null hypothesis and the eight new goodness of fit statistics, we
generated critical values for each test statistic by the following
method. For fixed sample sizes of 10(10)50 we generated n ordered
random variates from the null distribution (see Appendix 5 for a
further discussion of random variate generation). We next calculated
the approximate parameter estimates from the random sample. Finally,
we calculated each of the eight new test statistics for this sample.
The procedure was repeated 1000 times and the values for each test
statistic were ordered. Percentiles corresponding to alpha levels of
.20, .15, .10, .05, .025, and .01 were determined. The entire
process was then repeated five times, and the critical values for
each test statistic, at each sample size and alpha level, were
calculated by averaging the five corresponding percentiles.
Appendix 3 gives the tables for the critical values for the normal
and extreme value distributions, both when the null distribution is
completely specified and when the parameters are estimated. Values
are listed for five different sample sizes and six different alpha
levels.
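The simulation loop described above can be sketched for one statistic in the completely specified case. This is a minimal illustration under stated assumptions: it uses a DMR-type statistic with median ranks (i - .3)/(n + .4), replaces F(X) by uniform variates through the probability integral transform, and picks one particular order statistic as the percentile. The names and the percentile convention are not taken from the original study.

```python
import math
import random


def dmr_uniform(n, rng):
    """DMR-type statistic under a completely specified null: F(X_(i))
    behaves like the i-th ordered uniform(0, 1) variate."""
    u = sorted(rng.random() for _ in range(n))
    return max(abs(ui - (i - 0.3) / (n + 0.4))
               for i, ui in enumerate(u, start=1))


def critical_value(n, alpha, reps=1000, blocks=5, seed=0):
    """Average, over `blocks` independent runs, of the upper-alpha
    percentile of `reps` simulated statistics."""
    rng = random.Random(seed)
    # 0-based index of the (1 - alpha) percentile in the sorted values.
    k = min(reps - 1, math.ceil((1.0 - alpha) * reps) - 1)
    percentiles = []
    for _ in range(blocks):
        stats = sorted(dmr_uniform(n, rng) for _ in range(reps))
        percentiles.append(stats[k])
    return sum(percentiles) / blocks
```

Calling `critical_value(10, 0.05)` mimics the 1000-sample, five-repetition scheme for sample size 10. When parameters are estimated, the uniform shortcut no longer applies and each sample must be generated from the null distribution and refitted.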

Tables V.1 and V.2 show the critical values across sample sizes and
compare the eight new test statistic values with the classical
values for the KS, CVM and AD statistics for a completely specified
null hypothesis. Note the smaller critical values for the new
statistics (except A5 and A6 for sample sizes ≤ 30). This
observation strengthens the claim made earlier that our new
nonparametric model "better" approximates the true distribution than
the EDF. "Better" is now in terms of KS, CVM and AD distance
measures. Since each criterion for closeness of the true and
approximated functions measures different qualities of the
approximation, our distribution and density approximations of the
last chapter gain more credibility.

TABLE V.1

COMPARISON OF CRITICAL VALUES FOR THE NORMAL
DISTRIBUTION AT THE 5-PERCENT ALPHA LEVEL (1)

                          Sample Size
Statistic      10       20       30       40       50

D (2)        .4094    .2941    .2418    .2102    .1884
D5           .3147    .2160    .1738    .1511    .1323
D6           .3108    .2228    .1765    .1543    .1349
DMR          .3509    .2687    .2211    .1963    .1748
W^2 (2)      .5411    .5026    .4890    .4822    .4780
W5           .4513    .4267    .4067    .4101    .3998
W6           .4243    .4271    .4068    .4137    .4070
WMR          .4258    .4550    .4365    .4610    .4510
A^2 (2)      2.492    2.492    2.492    2.492    2.492
A5           4.416    2.907    2.556    2.367    2.175
A6           4.013    2.837    2.563    2.388    2.218

Note 1: Null distribution is completely specified.
Note 2: Critical values calculated from formulae given by Stephens
(Ref 89).

TABLE V.2

COMPARISON OF CRITICAL VALUES FOR THE EXTREME VALUE
DISTRIBUTION AT THE 5-PERCENT ALPHA LEVEL (1)

                          Sample Size
Statistic      10       20       30       40       50

D (2)        .4094    .2941    .2418    .2102    .1884
D5           .3256    .2183    .1751    .1531    .1363
D6           .3205    .2111    .1764    .1542    .1376
DMR          .3536    .2661    .2221    .1953    .1769
W^2 (2)      .5411    .5026    .4890    .4822    .4780
W5           .4802    .4530    .4213    .4171    .4239
W6           .4444    .4363    .4128    .4152    .4242
WMR          .4284    .4491    .4317    .4473    .4537
A^2 (2)      2.492    2.492    2.492    2.492    2.492
A5           4.516    3.111    2.587    2.398    2.345
A6           4.104    3.014    2.572    2.367    2.343

Note 1: Null distribution is completely specified.
Note 2: Critical values calculated from formulae given by Stephens
(Ref 89).

While  small  critical  values  do  indicate  a  high 
quality  approximation,  the  real  performance  of  a  goodness 
of  fit  test  is  measured  by  its  power. 

Power  Comparisons 

Once the critical values were determined, we next evaluated the
power of our new tests using various alternative distributions. Our
first concern was the verification of our critical values for both
distributions over all cases considered. Monte Carlo samples of size
1000 for the normal distribution and 2000 for the extreme value
distribution were generated for each random sample size of 10(10)50.
Tables V.3 and V.4 show the results of the critical value
verifications at sample size 20 with the parameters of the null
distributions estimated. All of the results indicated a good
agreement between the alpha level and the power of the test using
random samples generated by the null distribution. Thus, the
critical values were empirically confirmed.
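One way to judge "good agreement" is to compare each observed count with its binomial sampling error. The short sketch below is an added illustration and not part of the original verification; it assumes each Monte Carlo sample rejects independently with probability equal to the alpha level.

```python
import math


def verification_z(count, alpha, trials):
    """Standardized deviation of an observed rejection count from its
    expectation under the null, treating the count as binomial."""
    expected = alpha * trials
    sd = math.sqrt(trials * alpha * (1.0 - alpha))
    return (count - expected) / sd
```

For the D5 entry of Table V.3 at the .05 level (53 of 1000), `verification_z(53, 0.05, 1000)` is about 0.44, well inside two standard deviations. For the DMR entry of Table V.4 at the same level (111 of 2000) it is about 1.13.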


TABLE V.3

CRITICAL VALUE VERIFICATION FOR THE NORMAL
DISTRIBUTION AT SAMPLE SIZE 20

                          Alpha Level
Statistic    .20     .15     .10     .05    .025     .01

D5           201     156     105      53      26      14
D6           195     147      94      51      25      13
DMR          199     151     106      46      23       9
W5           202     156     102      52      24      14
W6           189     150     101      56      23      10
WMR          185     143      91      49      27      14
A5           201     155     108      51      24      14
A6           209     157     107      52      27      15

Entries represent the number of samples significant at the given
alpha level for each test statistic calculated over a Monte Carlo
sample of size 1000. The parameters of the null distribution were
estimated.

TABLE V.4

CRITICAL VALUE VERIFICATION FOR THE EXTREME VALUE
DISTRIBUTION AT SAMPLE SIZE 20

                          Alpha Level
Statistic    .20     .15     .10     .05    .025     .01

D5           410     308     201      85      41      12
D6           395     282     188      94      35      10
DMR          410     328     228     111      52      15
W5           405     305     204      87      42      14
W6           399     310     202      89      43      10
WMR          389     296     209     107      51      13
A5           401     303     192      89      42      22
A6           405     311     192      92      42      15

Entries represent the number of samples significant at the given
alpha level for each test statistic calculated over a Monte Carlo
sample of size 2000. The parameters of the null distribution were
estimated.

The general method followed in the power studies was to generate
1000 sets of random samples of size 10(10)50 for each alternative
distribution. Then, the eight test statistics were calculated for
each sample. The number of samples, for each sample size, which had
test statistics that exceeded the critical values was recorded. For
a given alternative distribution, situation type, sample size, alpha
level, and test statistic, the power of the test is the number of
samples significant divided by 1000, the Monte Carlo size.
Appendix 4 gives the results of some of the power studies for both
null distributions, the normal and extreme value. The cases
evaluated but not tabled include all of the results for alpha levels
.20, .15, and .025. Several alternative distributions were not
included in the tables but are discussed later in this chapter when
each null distribution is examined. However, the tables do present
the results for the most commonly used alpha levels and alternative
distributions which provide variety and a basis for future
comparisons.
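The power calculation just described reduces to counting exceedances. The following sketch illustrates it for a DMR-type statistic with a completely specified standard normal null. The statistic's form (median ranks (i - .3)/(n + .4)), the function names, and the Cauchy sampler are assumptions made for illustration; the critical value .2687 is the DMR entry of Table V.1 for sample size 20.

```python
import math
import random


def normal_cdf(x, mu=0.0, sigma=1.0):
    """Normal distribution function via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def dmr(sample, cdf):
    """KS-type distance between the hypothesized CDF and median ranks."""
    ordered = sorted(sample)
    n = len(ordered)
    return max(abs(cdf(x) - (i - 0.3) / (n + 0.4))
               for i, x in enumerate(ordered, start=1))


def cauchy(rng):
    """One standard Cauchy variate by inversion."""
    return math.tan(math.pi * (rng.random() - 0.5))


def estimate_power(sampler, cdf, n, critical_value, reps=1000, seed=0):
    """Fraction of `reps` samples of size n, drawn with `sampler`,
    whose statistic exceeds the critical value."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        sample = [sampler(rng) for _ in range(n)]
        if dmr(sample, cdf) > critical_value:
            hits += 1
    return hits / reps
```

A call such as `estimate_power(cauchy, normal_cdf, 20, 0.2687)` estimates the power against a standard Cauchy alternative. Passing a sampler for the null distribution itself should return a value near the alpha level, which is the verification step described earlier.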

Because of the similarity between Models 5 and 6, the correlation
between the new test statistics should be rather high. To gain some
insight into the correlations between all pairs of test statistics,
over 1400 output matrices similar to Table V.5 were constructed, one
for each combination of null distribution, hypothesis situation,
sample size, alpha level, and alternative distribution. Each cell of
the matrix contains the number of samples significant by both the
corresponding row and column statistics. Diagonal terms were used to
construct the power tables in Appendix 4.

TABLE V.5

TYPICAL OUTPUT MATRIX OF POWER STUDIES

Null Distribution: Extreme Value, Parameters Estimated
Alternative Distribution: Normal
Sample Size: 20
Alpha Level: .10

Statistic    D5    D6    DMR   W5    W6    WMR   A5    A6

D5           490
D6           399   409
DMR          225   221   252
W5           468   391   221   491
W6           416   376   218   417   419
WMR          265   267   209   264   264   280
A5           435   375   214   446   402   258   471
A6           399   357   206   404   378   252   420   438

Entries represent the number of samples significant by both row and
column statistics using a Monte Carlo sample of size 1000.
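The joint counts in such a matrix can be turned into a single measure of association between two tests. The sketch below uses the phi coefficient of the two reject/accept indicators. This choice of measure is an assumption made for illustration; the text does not say how the correlations were summarized.

```python
import math


def phi_coefficient(both, row_total, col_total, trials):
    """Phi (Pearson) correlation between two tests' reject/accept
    indicators, from the joint and marginal rejection counts."""
    num = trials * both - row_total * col_total
    den = math.sqrt(row_total * col_total
                    * (trials - row_total) * (trials - col_total))
    return num / den
```

Using Table V.5, D5 and W5 rejected 490 and 491 samples respectively and 468 jointly, so `phi_coefficient(468, 490, 491, 1000)` is about 0.91. The corresponding value for D5 and DMR (225 joint, 490 and 252 marginal) is noticeably lower.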

Normal Distribution. Tables A4.1 through A4.6 in Appendix 4 list the
results of the power study conducted for the normal distribution. We
attempted to construct a meaningful alternative distribution when
the null distribution parameters were completely specified.
Sometimes the null distribution parameters were adjusted for
simplicity. Eleven alternative distributions were considered.

For the double exponential, uniform, and Cauchy distributions, the
location and scale parameters of the null and alternative
distributions were zero and one respectively. For the exponential,
gamma, and extreme value distributions, the null distribution was
modified to have the same mean and variance as the standard form of
the alternative distribution. For example, the exponential
distribution had a location parameter of zero and a scale parameter
of one, while the normal null distribution had location and scale
parameters both equal to one. The lambda distributions had zero mean
and unit variance, as did the corresponding normal null
distribution. See Ramberg, et al., for a discussion of the four
parameter lambda distribution (Ref 72).

Table V.6 lists selected results of the power study. Parameters for
the null distribution have been estimated and only the results for
an alpha level of .05 are shown. The powers for the three lambda
distributions are included for comparison purposes. These three
distributions are not included in the general tables of Appendix 4.
To facilitate comparisons of our results with other published power
studies, we included the classical KS, CVM, and AD statistics
(listed as D, W^2 and A^2 respectively) as well as two modified EDF
statistics, D2 and A22. D2 is a summed KS distance between the
hypothesized distribution and the EDF (summed over the data points).
A22 is equal to n times the Anderson-Darling integral distance
listed in Appendix 1 after $H_n(x)$ is substituted for SF(x), where

$$H_n(x) = \frac{i + 1/2}{n+1}, \qquad X_{(i)} \le x < X_{(i+1)}, \quad i = 1, \ldots, n$$

See reference 29 for a further discussion of these two statistics.
Note that these five test statistics used for comparison had powers
calculated using different random samples than the ones used to
calculate the powers for the eight new test statistics.

Several observations deserve mention. First, the tests based on
Models 5 and 6 are superior in almost every instance to the tests
based on median ranks. Second, for the gamma alternatives, it
appears that D2 and A22 have a distinct advantage over the new
tests. Again, however, caution is advised since the underlying
random samples were different. Third, with the further exception of
the uniform, the new tests based on Models 5 and 6 have very
competitive powers.

TABLE V.6

SELECTED POWER COMPARISONS FOR THE NORMAL DISTRIBUTION
AT THE 5-PERCENT ALPHA LEVEL

Extreme Value Distribution. Tables A4.7 through A4.12 in Appendix 4
list the results of the power study conducted for the extreme value
distribution. An attempt, as in the normal power study, was made to
construct meaningful alternative distributions when the null
distribution parameters were completely specified. Twelve
alternative distributions were considered.

For the normal, uniform and double exponential distributions, the
location and scale parameters were the mean and the square root of
the variance of a standard extreme value distribution. The null
distribution had zero location parameter and unit scale parameter.
For the exponential, logistic and gamma distributions, location and
scale parameters for both null and alternative distributions were
set to zero and one respectively. As such, powers shown for the
exponential appear quite high in the completely specified case.
Power comparisons for the gamma distributions with shape parameters
2, 4 and 6 were made but are not listed in Appendix 4. Also not
listed in Appendix 4 are the results of the power study for the four
parameter lambda distribution with skewness equal to one and
kurtosis equal to four. Random variables from chi square
distributions with one degree and four degrees of freedom were also
generated. Taking minus the natural logarithm of these random
variables generates samples to compare against the extreme value
distribution; this is analogous to testing chi square random samples
against a two parameter Weibull distribution. Although listed as
$\chi^2$ distributions, it should be noted that the actual
comparison for the power determination was made between
$-\ln(\chi^2)$ and the extreme value distribution.
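Two of the constructions in this paragraph are easy to make concrete. The sketch below gives the mean and standard deviation of the standard largest-value extreme value distribution (Euler's constant and pi divided by the square root of 6), which serve as the matched location and scale, and generates the transformed chi square variates. Building each chi square variate as a sum of squared standard normals is an assumption chosen for simplicity, and the function names are hypothetical.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329


def standard_extreme_value_moments():
    """Mean and standard deviation of the standard largest-value
    extreme value distribution (location 0, scale 1)."""
    return EULER_GAMMA, math.pi / math.sqrt(6.0)


def neg_log_chi_square(df, n, seed=0):
    """n variates of -ln(chi square with `df` degrees of freedom), each
    chi square built as a sum of `df` squared standard normals."""
    rng = random.Random(seed)
    out = []
    for _ in range(n):
        chi2 = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(df))
        out.append(-math.log(chi2))
    return out
```

For example, `neg_log_chi_square(4, 20)` produces one sample of size 20 of the kind tested against the extreme value null.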

Table V.7 lists selected results of the extreme value power study.
Parameters for the null distributions have been estimated and only
the results for an alpha level of .05 are shown. Parts of Table III
of reference 51 are included to allow for comparisons to be made.
However, again caution is advised since the random samples which
generated both sets of powers were different. The values listed from
reference 51 are rounded to compare with a Monte Carlo sample of
size 1000. The D, W^2 and A^2 are the standard KS, CVM and AD test
statistics. T is Smith and Bain's correlation statistic and S is
Mann, Scheuer and Fertig's statistic. Both were referenced earlier
in this chapter.

We note several trends. Again we detect the inferior performance of
tests based on the median ranks as compared to the corresponding
tests using Models 5 and 6. Note that every test based on Models 5
and 6 is superior to all tests reported by Littell, McClave and
Offen for the normal, double exponential, and logistic alternatives.
Results for the uniform and exponential show the superiority of A5
and A6. Comparisons for the Cauchy indicate all test statistics are
competitive. The $\chi^2$ results exhibit a curious behavior. Like
the T and S statistics, D5, W5 and A5 all show powers below the
alpha level for some sample size. Thus, it appears that the
statistics based on Model 5 are biased toward the $\chi^2$
distribution. This same phenomenon occurred in all eight test
statistics when the alternative distribution was a gamma with shape
parameter 4 and in the test statistics based on Models 5 and 6 when
the alternative was the lambda distribution described earlier. These
results indicate a bias of the test statistics toward the gamma and
lambda distributions. Results of the $\chi^2$ distribution were
unexpected. For sample size 40, the new test statistics based on
Models 5 and 6 show approximately 100 percent improvement in power
over their corresponding classical test statistic.

Table V.7, Note 1: Values for these statistics were taken from
reference 51, Table III.

With respect to the goodness of fit tests proposed for the extreme
value distribution, it should be noted that these are equivalent to
tests for the two parameter Weibull distribution if the data are
transformed into new random variables $Y_i = -\ln X_i$, where
$\{X_i\}$, $i = 1, \ldots, n$, is the sample to be compared with the
Weibull.
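The equivalence can be checked directly from the two distribution functions. In the sketch below, a Weibull variable with shape k and scale lambda is mapped by minus the natural logarithm to a largest-value extreme value variable with location -ln(lambda) and scale 1/k. The function names are hypothetical and the parameterization of the Weibull (shape and scale, no location) is an assumption.

```python
import math


def weibull_cdf(x, shape, scale):
    """Two parameter Weibull distribution function."""
    if x <= 0.0:
        return 0.0
    return 1.0 - math.exp(-((x / scale) ** shape))


def extreme_value_cdf(y, loc, scale):
    """Largest-value extreme value distribution function."""
    return math.exp(-math.exp(-(y - loc) / scale))


def weibull_to_extreme_value(sample):
    """Transform positive Weibull data by minus the natural logarithm."""
    return [-math.log(x) for x in sample]


def implied_extreme_value_params(shape, scale):
    """(location, scale) of -ln X when X is Weibull(shape, scale)."""
    return -math.log(scale), 1.0 / shape
```

Since P(-ln X <= y) equals P(X >= exp(-y)), the Weibull survival function at exp(-y) should match the extreme value distribution function at y under the implied parameters.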

Summary 

The level of precision which we were able to attain in distribution
and density function estimation laid the foundation for extending
the application of our new nonparametric models into the goodness of
fit arena. After a brief survey of the literature, we proposed eight
new test statistics, six based on adaptive Models 5 and 6, and two
of the modified EDF class. The generation of critical values and the
Monte Carlo mechanics of the power studies were presented for
goodness of fit tests for the normal and extreme value
distributions. Appendices 3 and 4 contain much of the tabular
results. The power comparisons showed that tests based on Models 5
and 6 were competitive when the null distribution was normal, and
competitive, if not superior, when the null distribution was the
extreme value. The magnitude of the improvement in power in the
extreme value tests against normal, double exponential, and logistic
alternatives strongly suggests that these new tests are superior
over various alternatives. Tests for the two parameter Weibull are
also possible since they are equivalent to tests for the extreme
value distribution.

Thus far, we have been successful in distribution and density
estimation, and goodness of fit testing. The next chapter will
venture into the realm of parametric estimation using our
nonparametric distribution and density function models.


VI. Location Parameter Estimation for
Symmetric Distributions

Introduction

Given a random sample of size n from a univariate continuous
probability distribution, we have already generated nonparametric
estimates of the distribution, density, and hazard functions as well
as proposed new goodness of fit tests. Rather than a complete
distribution estimate, one may wish to estimate only certain
characteristics of the distribution. While the nonparametric
procedure holds promise for estimating parameters from an assumed
model in general, we now propose to examine one specific class of
estimates, namely the estimates of the location parameter of a
symmetric family of distributions. Our treatment begins with a
literature overview of location estimates and a discussion of the
concept of robustness. Many of the estimators identified were used
in the celebrated Princeton robustness study (Ref 5). Because of the
performance of the new nonparametric models in approximating
underlying distributions, it was conjectured that estimators based
on the models might exhibit some useful robust characteristics in
the location problem. Based on some very elementary concepts of
trimming and Winsorizing, we propose some 48 new estimators of the
location parameter using these new models. Estimator evaluation is
accomplished in terms of standardized empirical variances determined
from a Monte Carlo analysis considering samples of size 20.
Comparisons of estimators are made using relative deficiencies, both
average and maximum, over subsets of nine alternate distributions. A
large number of pairwise comparisons are graphically illustrated via
deficiency plots. Finally, robustness characteristics are evaluated
in the form of stylized sensitivity curves. The judicious use of the
tables and figures of this chapter should allow an analyst to judge
which estimator is appropriate for the alternative distributions he
may expect. We include twelve other estimators for comparative
purposes.

Historical  Survey 

Like  goodness  of  fit  tests,  parameter  estimation 
has  not  suffered  from  lack  of  attention  in  the  literature. 

In  this  section  we  will  briefly  examine  some  recent  studies 
which  bear  on  the  present  investigation.  We  will  limit 
our  discussion  to  location  parameter  estimates  of  a  sym¬ 
metric  distribution  and  considerations  of  robustness. 

The  concept  of  robustness  is  central  to  our  investi¬ 
gation.  Robustness,  as  defined  by  Hampel,  simply  means 
that  small  changes  in  the  assumed  underlying  model  should 
cause only a small change in the performance of an
estimator (Ref 30). Excellent surveys of the development
of robust techniques are given by Stigler, Hogg, and Huber
(Refs 38, 42, 91, 93).

Computational  formulae  and  applications  for  common 
robust  estimates  are  given  by  Moore,  Hogg  and,  to  a  limited 
extent,  David  (Refs  19,  39,  60).  Some  specific  estimators 
deserve  mention,  particularly  the  "alphabet"  estimators. 
Huber developed M-estimators, based on minimizing a function
of the form $\sum_i \rho(X_i - T)$ where $\rho$ is an arbitrary function.
Specific choices of $\rho$ result in the estimator T being the
sample  mean,  sample  median,  or  a  maximum  likelihood  esti¬ 
mator  (Ref  41).  Hampel  introduced  a  family  of  piecewise 
linear M-estimators (Ref 5). Linear combinations of order
statistics form a general class known as L-estimators.
Besides  trimmed  and  Winsorized  means,  this  class  includes 
estimators  given  by  Alam,  Harter,  Gastwirth  and  others 
(Refs  2,  26,  33) . 

A  recent  article  by  Chan  and  Rhodin  introduces 
asymptotically  best  linear  estimates  based  on  a  finite  num¬ 
ber  of  symmetrically  ranked  order  statistics.  These  esti¬ 
mates  are  shown  to  be  more  efficient  than  optimally  trimmed 
or  Winsorized  means  (Ref  12) .  Estimators  based  on  rank 
tests,  such  as  the  Hodges-Lehmann  estimator,  belong  to  the 
class  of  R-estimators  (Ref  37) .  More  recently,  a  family  of 
D-estimators  was  investigated  by  Parr  (Ref  61) .  Originally 
proposed by Wolfowitz, a D-estimator minimizes some
discrepancy (such as the CVM distance) between the empirical
distribution function and an underlying parametric
family (Ref 108). Parr and Schucany have shown that
D-estimation  is  a  competitive  technique  in  estimating  the 
location  parameter  of  symmetric  distributions  by  using  the 
normal  distribution  as  a  projection  model  (Ref  63) . 
D-estimation  using  a  weighted  CVM  discrepancy  is  discussed 
by Parr and DeWit (Ref 64). Sahler states the conditions
for existence and consistency of minimum discrepancy estimates
(Ref 79). Beran proposes and evaluates minimum
Hellinger distance estimators based on a discrepancy between
a density function estimate and the underlying density
function (Ref 6). The relationship between these types
of  estimates  and  goodness  of  fit  tests  is  given  by 
Easterling  (Ref  22) .  For  an  exhaustive  bibliography  of 
minimum  distance  estimation,  refer  to  Parr  (Ref  62) . 

Various  adaptive  procedures  have  emerged.  Hogg 
lists  variations  of  estimators  based  on  kurtosis,  the 
statistic  and  percentile  ratios  (Ref  38) .  Harter  proposed 
a  variant  of  Hogg's  estimator  using  certain  maximum  likeli¬ 
hood  estimates  and  kurtosis  as  a  discriminant  (Ref  60) . 
Optimal  boundaries  for  various  discriminants  were  deter¬ 
mined  by  Rugg  (Ref  77) .  Numerous  other  studies  have  been 
conducted  using  discriminants  and  generalized  projection 
families  such  as  the  GEP  distribution  or  the  t  distribution. 
Adaptive techniques incorporating both classical estimation
procedures and minimum distance constraints have recently
been  investigated  (Refs  3,  11,  16,  17,  24,  32,  34,  43,  55). 

Perhaps  the  single  most  comprehensive  study  of  esti¬ 
mates  of  the  location  parameter  of  a  symmetric  distribution 
was  the  Princeton  study  (Ref  5).  While  analyzing  some  68 
estimators,  the  authors  are  quick  to  point  out  that  their 
study  is  not  exhaustive.  Stigler  presents  an  interesting 
comparison  of  some  of  the  estimators  used  in  the  Princeton 
study. He uses 24 original data sets from famous experiments
conducted in the 18th and 19th centuries to determine
the parallax of the sun, the mean density of the earth, and
the  velocity  of  light.  Both  his  comments,  while  quite  nega¬ 
tive  toward  a  large  set  of  new  robust  estimators,  and  the 
comments  of  various  discussants  provide  a  refreshing  discus¬ 
sion  of  the  use  of  robust  procedures  (Ref  92) . 

Proposed  New  Estimators 

The  construction  of  the  new  nonparametric  cum¬ 
ulative  and  density  estimators  implicitly  gives  us  a 
technique  for  parameter  estimation.  This  analysis  only 
attempts  to  begin  to  explore  the  various  procedures  for 
estimating  the  parameters  of  an  underlying  distribution. 

We  chose  the  family  of  symmetric  distributions  for  two 
reasons.  First,  estimates  of  the  location  parameter  can 
be  constructed  in  very  simple  forms  since  the  mean, 
median, and mode of the density are identical.
Second, comparisons with other estimates are readily
available.

To  form  the  estimators  we  use  four  of  our  nonpara- 
metric  models — Models  2,  4,  5,  and  6.  The  means  and 
medians  of  the  models  comprise  the  first  eight  new  esti¬ 
mators.  The  means  were  calculated  using  a  modified 
Simpson's  Rule  integration  routine  and  the  medians  were 
found  by  inverting  the  distribution  function  estimate 
using  a  Newton-Raphson  technique.  Estimators  of  this 
type  are  identified  by  Mean-Mn,  Median-Mn,  etc.  where  Mn 
denotes  Model  n,  n=2,4,5,6. 
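
The two numerical steps just described can be sketched as follows. This is only an illustration under stated assumptions: the raised-cosine distribution below is a hypothetical stand-in with finite support and a density that vanishes at the endpoints, not one of Models 2, 4, 5, or 6, and the names `sf_density`, `sf_cdf`, `simpson`, `model_mean`, and `model_median` are introduced here for the example. An ordinary composite Simpson's rule is used in place of the modified routine, whose details are not given in this chapter.

```python
import math

# Hypothetical stand-in for a fitted model: a raised-cosine distribution
# centered at MU with half-width S. It is NOT one of the dissertation's
# models; it only shares their finite support and smoothness.
MU, S = 1.5, 4.0

def sf_density(x):
    """Density of the stand-in model; zero outside [MU - S, MU + S]."""
    if abs(x - MU) >= S:
        return 0.0
    return (1.0 + math.cos(math.pi * (x - MU) / S)) / (2.0 * S)

def sf_cdf(x):
    """Distribution function of the stand-in model."""
    if x <= MU - S:
        return 0.0
    if x >= MU + S:
        return 1.0
    u = (x - MU) / S
    return 0.5 * (1.0 + u + math.sin(math.pi * u) / math.pi)

def simpson(g, a, b, panels=200):
    """Composite Simpson's rule on [a, b] with an even number of panels."""
    if panels % 2:
        panels += 1
    h = (b - a) / panels
    total = g(a) + g(b)
    for k in range(1, panels):
        total += (4 if k % 2 else 2) * g(a + k * h)
    return total * h / 3.0

def model_mean():
    """Mean of the model: integral of x * density over the support."""
    return simpson(lambda x: x * sf_density(x), MU - S, MU + S)

def model_median(start, tol=1e-10, max_iter=50):
    """Newton-Raphson solution of sf_cdf(x) = 0.5, with the density as slope."""
    x = start
    for _ in range(max_iter):
        slope = sf_density(x)
        if slope == 0.0:
            break  # flat spot: the Newton step is undefined
        step = (sf_cdf(x) - 0.5) / slope
        x -= step
        if abs(step) < tol:
            break
    return x

mean_estimate = model_mean()
median_estimate = model_median(start=MU + 1.0)
```

For a symmetric model both quantities coincide with the center of symmetry, which gives a simple check on the integration and root-finding routines.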

Two  other  families  of  estimators  were  formed. 
Modified  trimmed  means  were  calculated  by  symmetrically 
trimming  a  percentage  of  observations  from  each  end  of 
the  original  ordered  sample  and  then  calculating  the  sample 
mean  of  the  nonparametric  density  defined  by  the  remaining 
data  points  and  our  models.  Five  different  levels  of 
trimming were used. The estimators are designated α percent
T-Mn where α is the trimming proportion, α = 5(5)25. Modified
Winsorized means were calculated based on the density
function determined by the entire original sample. To
calculate the modified Winsorized means, let α be the
amount (percentage) of Winsorizing. Calculate $SF^{-1}(\alpha)$
and $SF^{-1}(1-\alpha)$ where SF is the nonparametric estimator of
the distribution function. Then, the modified Winsorized
mean, $\hat{x}_\alpha$, is given by:

$$\hat{x}_\alpha = \int_{SF^{-1}(\alpha)}^{SF^{-1}(1-\alpha)} x \, dSF(x) + \alpha \left( SF^{-1}(\alpha) + SF^{-1}(1-\alpha) \right)$$


What we have effectively done is to take the mean of a
mixed distribution formed by truncating the nonparametric
density at $SF^{-1}(\alpha)$ and $SF^{-1}(1-\alpha)$ and letting these two
endpoints have a finite probability, namely α. This is
analogous to the Winsorized mean where sample points are
mapped back to the order statistics corresponding to the
amount of Winsorizing. Modified Winsorized means are
designated by α percent W-Mn where α is the amount of
symmetric Winsorizing, α = 5(5)25. This gives us a total
of  forty-eight  new  estimators  proposed. 
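
The modified Winsorized mean can be sketched numerically as shown below. The sketch assumes a hypothetical raised-cosine stand-in for the fitted distribution function SF (it is not one of our models), treats α as a proportion inside the formula, and uses bisection and a plain composite Simpson's rule; the function names are introduced only for this illustration.

```python
import math

# Hypothetical stand-in for SF: raised-cosine distribution on [MU - S, MU + S].
MU, S = 2.0, 3.0

def sf_density(x):
    if abs(x - MU) >= S:
        return 0.0
    return (1.0 + math.cos(math.pi * (x - MU) / S)) / (2.0 * S)

def sf_cdf(x):
    if x <= MU - S:
        return 0.0
    if x >= MU + S:
        return 1.0
    u = (x - MU) / S
    return 0.5 * (1.0 + u + math.sin(math.pi * u) / math.pi)

def sf_quantile(p, tol=1e-12):
    """Invert sf_cdf by bisection on the finite support."""
    lo, hi = MU - S, MU + S
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if sf_cdf(mid) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def simpson(g, a, b, panels=200):
    """Composite Simpson's rule with an even number of panels."""
    h = (b - a) / panels
    total = g(a) + g(b)
    for k in range(1, panels):
        total += (4 if k % 2 else 2) * g(a + k * h)
    return total * h / 3.0

def modified_winsorized_mean(alpha):
    """Integral of x dSF(x) between the alpha and (1 - alpha) quantiles,
    plus probability alpha placed at each of the two quantiles."""
    lower = sf_quantile(alpha)
    upper = sf_quantile(1.0 - alpha)
    interior = simpson(lambda x: x * sf_density(x), lower, upper)
    return interior + alpha * (lower + upper)

# alpha = 5(5)25 percent, entered as proportions
winsorized = {pct: modified_winsorized_mean(pct / 100.0) for pct in range(5, 30, 5)}
```

Because the stand-in distribution is symmetric about `MU`, every level of Winsorizing returns the center of symmetry; for an asymmetric fitted model the five values would generally differ.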

Estimator  Evaluation 

Using the Princeton study as a guide, we conducted a
limited Monte Carlo analysis of these estimators. We generated
1000 Monte Carlo samples of size 20 from nine different
distributions  including  the  normal,  double  exponential, 
Cauchy  and  six  contaminated  normals.  The  normal,  double 
exponential  and  Cauchy  distributions  all  had  a  zero  location 
parameter and a unit scale parameter. The contaminated normals
consisted of ε percent observations from a normal with
zero mean and a scale parameter of three and (100-ε) percent
observations from a standard normal. The contamination percentages
used were 5, 10, 15, 25, 50, and 75. These distributions
will be designated ε percent 3N where ε is the contamination
percentage.
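
A sampler for the ε percent 3N distributions might look like the following sketch. It assumes a random mixture, in which each observation is independently drawn from the wide normal with probability ε/100; the text above does not say whether the number of contaminating observations per sample was random or fixed, so this is one reading only. The function name and the seed are illustrative.

```python
import random

def contaminated_normal_sample(n, eps_percent, rng, wide_scale=3.0):
    """Draw n points from the eps_percent 3N mixture.

    Each point comes from N(0, wide_scale^2) with probability eps_percent/100
    and from the standard normal otherwise (mixture interpretation, an
    assumption of this sketch).
    """
    p = eps_percent / 100.0
    return [rng.gauss(0.0, wide_scale if rng.random() < p else 1.0)
            for _ in range(n)]

rng = random.Random(2024)  # arbitrary seed for reproducibility
sample_25 = contaminated_normal_sample(20, 25, rng)

# Large-sample check: the mixture variance is (1 - p) + 9p, i.e. 3.0 for 25% 3N.
big = contaminated_normal_sample(20000, 25, rng)
big_variance = sum(x * x for x in big) / len(big)
```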

The  distributions  were  grouped  into  classes  of 
alternatives  to  the  normal,  using  the  same  groupings  as 
the  Princeton  study.  The  gentle,  reasonable  alternatives 
include the normal, 5% 3N, 10% 3N, 15% 3N, and 25% 3N.
Gentle,  unreasonable  alternatives  include  50%  3N  and 
75%  3N.  Vigorous  alternatives  include  the  double  exponen¬ 
tial  and  the  Cauchy.  A  fourth  set  of  alternatives  con¬ 
sidered  was  the  set  of  all  distributions  tested  except 
the  Cauchy.  No  specific  short  tailed  distribution  was 
tested  in  this  portion  of  the  study.  The  groupings  relate 
to  how  the  analyst  views  the  practical  world  his  data 
comes  from.  Using  the  normal  distribution  as  a  model  of 
reality,  the  sampling  mechanism  and  underlying  process 
may  allow  for  only  mild  departures  from  normality.  In 
other  cases,  an  analyst  may  want  protection  against  a 
larger  deviation  in  his  underlying  view  of  the  world.  By 
generating  various  sets  of  alternatives,  we  may  infer  the 
conditions  under  which  certain  estimators  perform  better. 

For  each  random  sample  we  calculated  all  48  esti¬ 
mates.  For  comparison  purposes,  we  also  included  the 
sample  mean,  sample  median,  and  ten  M-estimators,  con¬ 
sisting  of  six  Hubers  and  four  Hampels.  The  Hubers 
include H20, H17, H15, H12, H10, and H07, while the
Hampels  used  were  25A,  21A,  17A,  and  12A.  For  a  complete 


definition of these estimators and their associated parameters,
refer to the Princeton study (Ref 5). Results of
this Monte Carlo study for the Hubers and Hampels are in
excellent agreement with the variances given in that same
study.

Table VI.1 gives the standardized empirical variances
for all sixty estimators used. Table entries represent
the mean square error of the estimate multiplied by
the  sample  size.  Even  when  actual  variances  are  available, 
we  used  the  empirical  ones  to  compare  estimators  to  keep 
relative  rankings  consistent.  For  example,  the  true  vari¬ 
ance  of  the  sample  mean  is  1/n  for  an  underlying  normal 
population.  Thus  the  table  entry  should  be  1.000.  We, 
however,  will  use  our  empirical  variance  entry  of  0.990 
for  relative  comparisons. 
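
As a rough illustration of how such a table entry arises, the sketch below computes the standardized empirical variance (sample size times the Monte Carlo mean square error) for the sample mean and sample median under the standard normal. The seed, the helper name, and the use of Python's `random` generator are assumptions of this sketch, so its output will not reproduce the entries of Table VI.1.

```python
import random
import statistics

def standardized_empirical_variance(estimator, draw_sample, n=20, reps=1000,
                                    true_location=0.0):
    """n times the Monte Carlo mean square error of a location estimator."""
    squared_error = 0.0
    for _ in range(reps):
        sample = draw_sample(n)
        squared_error += (estimator(sample) - true_location) ** 2
    return n * squared_error / reps

rng = random.Random(12345)  # arbitrary seed

def draw_normal(n):
    """One standard normal sample of size n."""
    return [rng.gauss(0.0, 1.0) for _ in range(n)]

v_mean = standardized_empirical_variance(statistics.fmean, draw_normal)
v_median = standardized_empirical_variance(statistics.median, draw_normal)
```

With 1000 replications the value for the sample mean fluctuates around the theoretical 1.000 by a few hundredths, which is the kind of sampling error visible in an empirical entry such as 0.990.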

To  synthesize  this  information  into  meaningful  com¬ 
parisons,  we  introduce  the  concept  of  deficiencies.  The 
deficiency of an estimator is akin to Hogg's "insurance
premium" of using a robust estimate. It is the penalty
you pay if the distributional assumption you chose not to
make is actually correct. Deficiencies are calculated as
follows: Let $T_{ij}$ be an estimator of type i over a set of
test distributions indexed by j. Now let $T_{\min,j}$ be the
estimator with the smallest standardized empirical variance
for distribution j. Define

$$\text{efficiency}(T_{ij}) = \frac{\text{variance of } T_{\min,j}}{\text{variance of } T_{ij}}$$

Then, deficiency = 1 - efficiency. Naturally, one prefers
deficiencies near zero.

Table VI.1. Standardized Empirical Variances of the Estimators for Sample Size 20

For each set of alternatives we calculated two
measures of deficiency: the maximum deficiency of an estimator
over all distributions in the class and the average
deficiency over the class. Again, depending on the
sampling situation, one criterion may be more appropriate
than another. An analyst faced with a large penalty for
poor performance would probably prefer the maximum relative
deficiency criterion.
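
The deficiency calculation and the two summary criteria can be sketched as below. The variances in the example are made-up illustrative numbers, not values from Table VI.1, and the estimator and distribution labels are placeholders.

```python
# Illustrative, made-up standardized variances (NOT values from Table VI.1).
variances = {
    "estimator A": {"dist 1": 1.00, "dist 2": 1.60, "dist 3": 2.50},
    "estimator B": {"dist 1": 1.10, "dist 2": 1.25, "dist 3": 2.00},
    "estimator C": {"dist 1": 1.25, "dist 2": 1.30, "dist 3": 1.60},
}

def deficiencies(variances, distributions):
    """Deficiency = 1 - (smallest variance for distribution j) / (own variance)."""
    best = {j: min(row[j] for row in variances.values()) for j in distributions}
    return {name: {j: 1.0 - best[j] / row[j] for j in distributions}
            for name, row in variances.items()}

def summarize(variances, distributions):
    """Return (maximum deficiency, average deficiency) over a class."""
    table = deficiencies(variances, distributions)
    return {name: (max(d.values()), sum(d.values()) / len(d))
            for name, d in table.items()}

alternatives = ["dist 1", "dist 2", "dist 3"]
summary = summarize(variances, alternatives)

# Rank estimators under each criterion (smaller is better).
rank_by_max = sorted(summary, key=lambda name: summary[name][0])
rank_by_avg = sorted(summary, key=lambda name: summary[name][1])
```

In this toy table estimator A is best under "dist 1" yet has the largest maximum deficiency, which is the trade-off the two criteria are meant to expose.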

Tables VI.2 through VI.5 rank each of the 60 estimators
with respect to both maximum relative and average
relative deficiencies under each different set of alternative
distributions. Notice, in particular, the excellent
performance of the new estimators under gentle, reasonable
alternatives and under all alternatives except Cauchy
(Tables VI.2 and VI.5). Of particular note is the fact
that only one modified Winsorized mean is among the 20
leading estimators under either relative efficiency criterion
for any set of alternatives. This estimator,
25%W-M6, is clearly the best of the modified Winsorized
estimators that were proposed. Under gentle, reasonable
alternatives, the modified trimmed mean, 10%T-M2, seems to
perform "better" than the other estimators for either
deficiency criterion. For protection against vigorous
alternatives Hampel's 12A seems to be the preferred choice.

Table VI.2. Estimators Ranked by Relative Deficiencies under Gentle, Reasonable Alternatives

As  expected,  no  one  estimator  clearly  surpassed 
the  field.  Depending  on  each  sampling  situation  and  the 
set  of  likely  alternatives,  the  choice  of  an  estimator 
is  largely  subject  to  analyst  discretion. 

Another  comparison  can  be  drawn  between  estimators 
or  families  of  estimators.  By  plotting  the  deficiency  of 
an  estimator  or  a  family  of  estimators  under  one  alterna¬ 
tive  distribution  versus  another  alternative,  we  get  a 
graphical  comparison  of  the  relative  performance  of  the 
estimators.  Such  deficiency  plots,  using  the  normal  as 
one  alternative  in  all  cases,  were  constructed  for  the 
double  exponential,  Cauchy,  and  the  contaminated  normals. 
Figures 6.1 through 6.16 compare the deficiencies for the
medians  of  some  of  the  nonparametric  models,  the  modified 
Winsorized  estimator  25%W-M6,  the  family  of  Hubers,  the 
family  of  Hampels,  and  the  families  of  trimmed  means  for 
Models 2, 4, 5, and 6. For each specific alternative distribution,
a set of two plots was generated for clarity.
The  first  plot  shows  the  comparison  of  the  nonparametric 
medians  and  25%W-M6  with  the  Hubers  and  Hampels.  The 
medians  on  this  plot  are  designated  Mn  where  n  is  the  model 
number.  The  second  plot  shows  the  comparison  among  the 
four  families  of  trimmed  means  generated  from  Models  2,  4, 
5, and 6. Each family is labeled by its corresponding
model number. Since the modified Winsorized means as
families and the means of the nonparametric models did not
appear to be competitive estimators, we chose not to include
their deficiency plots. We also chose to plot only the
deficiency comparisons against a normal world. Based on
the values in Table VI.1, other deficiency plots could be
generated for any pair of alternative distributions.

Figure 6.2. Deficiency Plot for Trimmed Means--Double Exponential vs Normal

Figure 6.3. Deficiency Plot for Medians, 25%W-M6, Hubers and Hampels--Cauchy vs Normal

Figure 6.5. Deficiency Plot for Medians, 25%W-M6, Hubers and Hampels--5% 3N vs Normal

Figure 6.6. Deficiency Plot for Trimmed Means--5% 3N vs Normal

Figure 6.7. Deficiency Plot for Medians, 25%W-M6, Hubers and Hampels--10% 3N vs Normal

Figure 6.9. Deficiency Plot for Medians, 25%W-M6, Hubers and Hampels--15% 3N vs Normal

Figure 6.10. Deficiency Plot for Trimmed Means--15% 3N vs Normal

Figure 6.11. Deficiency Plot for Medians, 25%W-M6, Hubers and Hampels--25% 3N vs Normal

Figure 6.12. Deficiency Plot for Trimmed Means--25% 3N vs Normal

Figure 6.13. Deficiency Plot for Medians, 25%W-M6, Hubers and Hampels--50% 3N vs Normal

Figure 6.14. Deficiency Plot for Trimmed Means--50% 3N vs Normal

Figure 6.15. Deficiency Plot for Medians, 25%W-M6, Hubers and Hampels--75% 3N vs Normal

Figure 6.16. Deficiency Plot for Trimmed Means--75% 3N vs Normal

As  a  final  means  of  estimator  evaluation,  we  use 
a  tool  developed  by  Hampel--the  influence  curve.  Hampel 
describes  the  influence  curve  as  ".  .  .  essentially  the 
first  derivative  of  an  estimator,  viewed  as  a  functional, 
at  some  distribution.  .  ."  (Ref  31).  We  have  chosen  to 
approximate  the  influence  curves  for  the  finite  sample  case 
by  the  use  of  "stylized  sensitivity  curves,"  similar  to  the 
ones  used  in  the  Princeton  study.  These  stylized  sensi¬ 
tivity  curves  for  sample  size  20  were  generated  in  the  fol¬ 
lowing  manner.  Let  T(x)  be  a  location  parameter  estimator. 
Generate  a  stylized  sample  from  the  normal  distribution  by 
inverting  the  standard  normal  distribution  function  at  the 
median  ranks  for  a  sample  size  19.  To  these  19  stylized 
order  statistics  add  a  20th  point  at  regular  intervals 
across  the  real  line.  We  chose  201  such  data  points  at 
equally spaced intervals on [-3,3]. Calculate the estimator
T(x) for each stylized sample of size 20. Plotting
nT(x), where n=20, versus x, the added data point, gives
us our estimated influence curve.
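
The construction above can be sketched for two classical estimators, the sample mean and the sample median. The sketch assumes Benard's approximation (i - 0.3)/(m + 0.4) for the median ranks, since the exact values used are not stated here, and it does not implement any of the nonparametric models; the function names are introduced only for the illustration.

```python
import statistics
from statistics import NormalDist

def stylized_normal_sample(m=19):
    """Standard normal quantiles at the median ranks for a sample of size m.

    Benard's approximation (i - 0.3) / (m + 0.4) is assumed for the
    median ranks.
    """
    std_normal = NormalDist()
    return [std_normal.inv_cdf((i - 0.3) / (m + 0.4)) for i in range(1, m + 1)]

def stylized_sensitivity_curve(estimator, m=19, points=201, lo=-3.0, hi=3.0):
    """Return (x, n * T(x)) pairs, where x is the added point and n = m + 1."""
    base = stylized_normal_sample(m)
    n = m + 1
    curve = []
    for k in range(points):
        x = lo + (hi - lo) * k / (points - 1)
        curve.append((x, n * estimator(base + [x])))
    return curve

mean_curve = stylized_sensitivity_curve(statistics.fmean)
median_curve = stylized_sensitivity_curve(statistics.median)
```

For the sample mean the curve is the line nT(x) = x, because the stylized sample sums to zero; for the sample median it flattens as soon as the added point passes the order statistics adjacent to the center.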


Figures 6.17 through 6.23 show the stylized sensitivity
curves for some of the more competitive estimators determined
by the relative efficiency criteria.

Viewing  the  stylized  sensitivity  curve  as  a 
derivative  plot,  we  can  determine  how  our  estimators 
change  with  the  addition  of  a  new  data  point.  Consider 
the curve for the median of Model 4 in Figure 6.17. The
discontinuity near x = +2.4 is due to the adaptive technique
employed in the model. At that point, the percentile ratio
dictated a model change. The other adaptive models were
not similarly affected since the percentile ratios could
not be low enough when using a stylized normal sample.
Unlike the influence curve for the sample median, which
becomes constant only a very short distance from zero, the
medians based on the nonparametric distribution models
change more slowly as the added data point proceeds away from
zero. The median curves for Models 4 and 5 were
still monotonically increasing in absolute value as data
points were added further away from zero. The changes
were, however, very small at the ends of the interval
considered and were decreasing in magnitude. The stylized
sensitivity curve for Model 6 became constant for x values
outside the interval $[X_{(3)}, X_{(17)}]$, where these order statistics
are now based on the stylized sample of size 19.
Curves for the modified trimmed means also become constant
at some point away from zero, just as curves for simple
trimmed means do. This constant value of the sensitivity
curve indicates that only the sign of the added data point
is being noticed by the estimator. The actual value of
the additional point could be at any point corresponding
to the constant value of the curve. The "influence" on
the estimator of two such points is thus identical. If
an influence curve goes to zero, the estimator totally
rejects the added data point. For our purposes, the value
at which the influence curve initially becomes zero is
termed the rejection point. Only the Hampels considered
in this study have a finite rejection point. No nonparametric
estimator proposed completely rejects outliers.

Figure 6.17. Stylized Sensitivity Curve for Median-M4

Figure 6.18. Stylized Sensitivity Curve for Median-M5

Figure 6.21. Stylized Sensitivity Curve for 20%T-M5

Figure 6.22. Stylized Sensitivity Curve for 15%T-M6

Figure 6.23. Stylized Sensitivity Curve for 25%W-M6

Returning  to  Figure  6.17,  another  type  of  "influ¬ 
ence"  can  be  seen.  When  the  adaptive  procedure  comes  into 
play,  it  lessens  the  effect  on  the  estimator.  Thus,  a 
data  point  added  to  the  sample  at  x=2.8  has  a  smaller 
effect  on  the  median  using  Model  4  than  a  data  point  added 
at  x=2.3. 

The  influence  curve  also  allows  for  various  other 
measures  of  robustness.  One  such  measure  is  gross  error 
sensitivity,  the  worst  influence  an  outlier  can  cause.  We 
approximate gross error sensitivity by the supremum of the
absolute value of the stylized sensitivity curve. Of
the  new  estimators  proposed,  the  one  with  the  smallest 
approximate  gross  error  sensitivity  was  the  median  for 
Model 6, with a value of 1.37. When compared with the
estimators evaluated by Hampel, only the sample median
possesses a smaller gross error sensitivity at the
standard normal distribution (Ref 31). For other measures
of  robustness,  such  as  local  shift  sensitivity,  asymptotic 
variance,  and  breakdown  points,  the  reader  is  referred  to 
Hampel's  article. 
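
A minimal sketch of the approximate gross error sensitivity follows, again for the sample mean and sample median rather than for the new estimators. It assumes Benard's approximation for the median ranks and takes the supremum over the same 201-point grid on [-3,3]; the helper name is hypothetical.

```python
import statistics
from statistics import NormalDist

def approx_gross_error_sensitivity(estimator, m=19, points=201, lo=-3.0, hi=3.0):
    """Largest |n * T(x)| over the grid of added points, with n = m + 1.

    The stylized sample uses Benard's median-rank approximation (an
    assumption of this sketch).
    """
    std_normal = NormalDist()
    base = [std_normal.inv_cdf((i - 0.3) / (m + 0.4)) for i in range(1, m + 1)]
    n = m + 1
    xs = (lo + (hi - lo) * k / (points - 1) for k in range(points))
    return max(abs(n * estimator(base + [x])) for x in xs)

ges_median = approx_gross_error_sensitivity(statistics.median)
ges_mean = approx_gross_error_sensitivity(statistics.fmean)
```

The value for the sample mean simply equals the largest grid point, reflecting an unbounded influence curve, while the value for the sample median stays bounded and lies close to the asymptotic gross error sensitivity of the median at the standard normal, sqrt(pi/2), which is about 1.25.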

Summary 

This  chapter  has  addressed  one  specific  problem 
in  parametric  estimation,  namely  estimating  the  location 
parameter  of  a  symmetric  distribution.  We  began  by  review¬ 
ing  some  of  the  literature  available  concerning  robustness 
aspects  of  the  problem  and  various  proposals  for  esti¬ 
mators.  Besides  M,  L,  R,  and  D  estimators,  adaptive  tech¬ 
niques  were  also  reviewed.  Next  we  proposed  some  48  new 
estimators  based  on  the  new  nonparametric  models.  Model 
means  and  medians  as  well  as  modified  trimmed  and  modified 
Winsorized  means  were  defined.  These  48  estimators  were 
then  evaluated  along  with  the  sample  mean,  sample  median 
and  estimators  previously  proposed  by  Huber  and  Hampel. 

A  Monte  Carlo  analysis  generated  a  standardized  empirical 
variance  for  each  estimator  under  nine  alternative  distri¬ 
butions.  A  relative  deficiency  comparison  was  then  made 
over  four  classes  of  alternative  distributions.  Under 
mild deviations from the normal distribution, new nonparametric
estimators possessed smaller average relative
deficiency or smaller maximum relative deficiency than the
Hubers  or  Hampels.  Estimators  and  estimator  families  were 
further  compared  via  deficiency  plots  using  alternatives 
to  the  normal  distribution.  For  some  of  the  better  esti¬ 
mators,  approximate  influence  curves  were  presented. 
Robustness  considerations  using  these  stylized  sensitivity 
curves  showed  that  some  of  the  new  estimators  are  certainly 
competitive and robust.

VII. Summary, Applications, Limitations
and Improvements


Summary 

Motivated  by  the  dominance  of  the  empirical  dis¬ 
tribution  function  in  practically  every  area  of  statis¬ 
tical  inference,  this  research  effort  investigated  an 
alternative to the EDF. After initially examining some
other  sample  distribution  functions  and  related  plotting 
positions,  we  proposed  a  new  nonparametric  family  of  con¬ 
tinuous,  differentiable,  sample  distribution  functions. 

We  showed  that  members  of  this  family  possessed  the  proper¬ 
ties  of  a  distribution  function  and  also  converged  uni¬ 
formly  to  the  underlying  distribution.  Six  specific  mem¬ 
bers  of  the  family  were  chosen  as  models  for  the  rest  of 
the  analysis.  The  new  models  were  evaluated  in  three  dis¬ 
tinct  areas — their  ability  to  model  probability  distribu¬ 
tion  and  density  functions,  their  use  as  bases  for  goodness 
of  fit  tests,  and  their  use  in  estimating  the  location 
parameter  of  symmetric  distributions.  We  compared  the  dis¬ 
tribution  function  estimates  with  the  EDF  using  mean  inte¬ 
grated  square  error  as  the  criterion.  A  limited  Monte 
Carlo  analysis  indicated  that  the  new  models  were  superior 
to the EDF for most of the distributions tested. The derivatives
of the nonparametric distribution functions were
also evaluated against specifically designed density estimates
under the same error criterion. These new nonparametric
models were shown to be competitive with or superior
to  other  continuous  density  estimates.  Eight  new  goodness 
of  fit  statistics  were  generated  from  the  new  models.  An 
extensive  Monte  Carlo  analysis  confirmed  that  the  new 
goodness  of  fit  tests  for  the  normal  and  extreme  value  dis¬ 
tributions  had  comparable  or  greater  power  than  the  most 
powerful  established  tests.  Forty-eight  new  estimators 
for the center of symmetry of a symmetric population were
proposed  based  on  the  new  models  using  modified  trimmed 
and  Winsorized  means.  For  relatively  mild  variations  of  the 
normal  distribution  certain  new  nonparametric  estimators 
were  shown  to  have  smaller  standardized  empirical  vari¬ 
ances  than  other  robust  estimators. 

The  overall  performance  of  the  six  models  tested 
has  been  impressive.  Using  the  relatively  simple  concept 
of  plotting  positions  and  adding  elementary  properties  of 
continuity  and  differentiability,  we  generated  a  very  power¬ 
ful  tool  for  data  analysis.  Several  applications  of  these 
models  in  problems  of  statistical  inference  are  now  sug¬ 
gested. 

Applications 

Given  a  random  sample,  our  new  nonparametric  models 
can be used as representations of the distribution, density,
and hazard functions of the underlying process without
making any distributional assumption. The continuity of
the functions allows for easy graphical depiction. Inferences
about the underlying random variable can be made
directly.

The  new  models  can  also  serve  as  a  discriminant 
for  picking  a  parametric  model.  Having  three  continuous 
functions  (distribution,  density  and  hazard  functions) 
to  compare  against  selected  parametric  alternatives,  one 
could  choose  a  parametric  model  which  had  the  same  general 
characteristics  as  the  nonparametric  estimates.  Initially, 
this  could  be  done  by  graphical  means,  but  goodness  of 
fit  criteria,  using  various  distance  measures,  could  pro¬ 
vide  a  very  powerful  model  discriminant.  The  modified 
distance  measures  of  Appendix  1  allow  for  comparisons 
using  different  parametric  models  over  the  same  finite 
support  and  the  same  probability  measure. 

Closely  related  to  model  discrimination  is  the 
problem  of  parametric  estimation.  Beginning  with  an 
assumed  parametric  family,  parameter  estimates  are  made 
using  a  modified  distance  measure.  The  parametric  family 
is  changed  and  the  process  repeated  for  each  alternative 
family.  The  selection  of  the  parametric  model  is  then 
based  on  the  smallest  value  of  the  distance  criterion. 

The advantage of this technique is that both model discrimination
and parametric estimation are performed
simultaneously. A similar approach to the dual problem
of model discrimination and parameter estimation was suggested
by Borth, who used entropy as a criterion (Ref 9).
Another proponent of this approach is Easterling, who
attacks parameter estimation problems by inverting goodness
of fit tests (Ref 22). This is precisely what the
above approach does with respect to the modified distance
measures.
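
The combined discrimination and estimation procedure can be sketched as follows. Several simplifications are assumed: the ordinary Cramer-von Mises statistic against the EDF stands in for the modified distance measures of Appendix 1, only the location parameter is estimated (unit scale is fixed), the minimization is a plain grid search, and the two candidate families and all function names are chosen purely for illustration.

```python
import math
from statistics import NormalDist

def normal_cdf(x, loc):
    return NormalDist(loc, 1.0).cdf(x)

def laplace_cdf(x, loc):
    """Double exponential distribution function with unit scale."""
    z = x - loc
    return 0.5 * math.exp(z) if z < 0 else 1.0 - 0.5 * math.exp(-z)

def cvm_distance(sample, cdf, loc):
    """Cramer-von Mises statistic between the EDF and cdf(., loc)."""
    xs = sorted(sample)
    n = len(xs)
    return 1.0 / (12 * n) + sum(
        (cdf(x, loc) - (2 * i - 1) / (2 * n)) ** 2
        for i, x in enumerate(xs, start=1))

def fit_family(sample, cdf, grid):
    """Grid-search minimum distance estimate; returns (distance, location)."""
    return min((cvm_distance(sample, cdf, loc), loc) for loc in grid)

def discriminate(sample, families, grid):
    """Fit every family, then select the one with the smallest distance."""
    fits = {name: fit_family(sample, cdf, grid) for name, cdf in families.items()}
    best = min(fits, key=lambda name: fits[name][0])
    return best, fits

# Stylized normal sample of size 20 used as test data.
n = 20
sample = [NormalDist().inv_cdf((2 * i - 1) / (2 * n)) for i in range(1, n + 1)]
grid = [k / 100 for k in range(-100, 101)]
families = {"normal": normal_cdf, "double exponential": laplace_cdf}
best_family, fits = discriminate(sample, families, grid)
```

Each entry of `fits` holds the minimized distance together with the location estimate for that family, so the selected model and its parameter estimate are obtained in the same pass.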

Another  specific  example  of  the  use  of  the  new 
nonparametric  models  is  in  the  field  of  reliability.  Due 
to  high  cost  or  destructive  experiments,  the  reliability 
engineer  is  frequently  faced  with  sparse  data  sets  and  the 
need  for  a  tool  of  statistical  inference.  Our  new  models 
provide  the  capability  of  making  reliability  estimates 
from  small  data  sets  without  the  distribution  assumptions 
usually  made  in  reliability  analysis.  The  goodness  of  fit 
test  results  for  two  widely  used  models  in  life  testing, 
the  normal  and  the  extreme  value,  and  the  ability  to  esti¬ 
mate  the  hazard  function  by  a  continuous  model  indicate 
the  applicability  of  the  new  nonparametric  procedures  to 
reliability  problems.  The  continuity  of  the  sample  hazard 
function  also  creates  the  possibility  of  goodness  of  fit 
tests  based  on  some  distance  measure  between  hazard  func¬ 
tions.  Tests  using  hazard  functions  have  recently  been 
proposed by Kochar (Refs 46, 47). While these tests are
for the two sample problem, the new nonparametric models
may provide a basis for a one sample test.

The new models also hold promise for use in simulation
studies. Typically, Monte Carlo simulation is performed
when the distribution of the dependent random variable
is unknown. By taking a smaller Monte Carlo sample,
the distribution of the dependent variable can be estimated
nonparametrically. While no specific results are
available to date, the potential benefits of reductions of
Monte Carlo sample size warrant investigation. Such a technique
could be used in large scale simulations such as
cost  analysis. 

While all of the applications considered thus far
dealt with complete random samples, the nonparametric techniques
are also capable of modeling other types of data
sets. Grouped data are easily handled, provided that the
maximum number of data points in one group is no larger
than the number of subsamples used in the model. If
not, small offset values can be introduced to insure that
no subsample has two identical points. The generation of
the nonparametric models from a grouped data set is identical
to that of an ungrouped random sample. As such, we
can get a continuous distribution function estimate and
construct goodness of fit tests for grouped data in exactly
the same manner as we constructed the tests in Chapter V.


Limitations 


While extremely flexible, the new nonparametric
models are subject to certain limitations. In the theoretical
development, we arbitrarily set the derivative of
the nonparametric distribution function equal to zero at
each data point to insure differentiability. A consequence
of this step is that $\lim_{x \to X_{\min}} sf(x)$ and
$\lim_{x \to X_{\max}} sf(x)$ exist
and are equal to zero. Obviously some density functions do
not exhibit these same properties, for example, the uniform,
the exponential or a U-shaped beta. All of the nonparametric
estimates have density functions whose value is
zero at the endpoints of their finite support. The fixed
endpoint modifications introduced in the adaptive models
attempt to minimize the effect of discontinuities of the
underlying density functions. The nonparametric density
estimates are continuous over R; in general, density functions
are not.


Only unimodal densities were examined in the preceding
chapters. A limited analysis was done on a bimodal
distribution, the double triangular. The results indicated
that, while bimodality may be inferred, the density estimate
tended to attach unnecessary weight to the interval between
the modes. A further analysis is necessary to determine
the extent of this limitation.

Finally, the sinusoidal oscillation of the nonparametric
estimates may be undesirable to some analysts.
While not as smooth as the orthogonal series estimates,
the new estimates do possess the distribution function
properties lacking in the others. In all of the cases
considered in this analysis, the smoothing procedure used
tended to prevent radical motions in both the distribution
and density functions.

Improvements 

In examining our nonparametric models we chose
only a representative few members of the family which
showed good performance. We also limited ourselves to
small sets of initial variables for the estimators. While
we attempted to justify all of our choices as reasonable,
we examined only a very small set of possible variables.
The following are suggested as an initial list of possible
improvements to the method. First, other variable sets
for plotting positions, inversion points, etc., need to be
explored. Their evaluation should still depend on a distance
measure criterion, for both the distribution and
density functions, perhaps some linear combination of both.
Second, alternatives to the percentile ratios need to be
considered as discriminants. Third, other functions
besides the trigonometric ones need to be evaluated for
forming the continuous, differentiable models. Some
functions to consider are probability distribution functions
themselves; an analytic function with non-zero
derivative at the endpoints which could be pieced together
to form the sample distribution function would be ideal.
Finally, modification of the technique to model censored
samples would be an important contribution in reliability
and life testing.

Our investigation of nonparametric, continuous,
differentiable, sample distribution functions has covered
a large area of statistical inference, from distribution
and density estimation, to goodness of fit, to parameter
estimation. Our models have shown some significant
results, particularly at small sample sizes. Further
refinements of techniques based on continuous sample distribution
functions can further advance the field of statistical
inference.
Appendix 1

Modified  Distance  Measures 

A classical distance measure with respect to an
integral criterion is given by:

$$\delta(F,G) = \int_{-\infty}^{\infty} (F(x) - G(x))^2 \, \psi(F(x)) \, dF(x)$$

where $\psi(F(x))$ is some preassigned weight function (Ref 78).
For the Cramer von Mises distance, G(x) is the empirical
distribution function, $S_n(x)$, $\psi(F(x)) = 1$, and F(x) is the
postulated underlying model. Thus $\delta(F, S_n)$ is a CVM distance
measure.

Given a measure $\nu_{F_n}$ whose corresponding probability
distribution function $F_n$ is measurable, we can now consider
an alternative distance measure, $\delta(F_n, F)$. Since
SF(x), as defined in equation 3.6, is continuous and differentiable,
we can define:

$$\delta(SF,F) = \int_{X_{\min}}^{X_{\max}} (SF(x) - F(x))^2 \, \psi(SF(x)) \, dSF(x)$$

In the classical case, for $\psi(F(x)) = 1$, $\delta(F,G)$ is the
integrated square error with a weight of f induced by the
dF(x) term. Using $S_n$ as an approximation to F so that
$dS_n(x)$ approximates f(x) dx results in

$$\delta(F,G) \approx \int_{-\infty}^{\infty} (F(x) - G(x))^2 \, dS_n(x),$$

which is the average square error between the distribution
functions F and G (Ref 105). Since F is approximated by
SF, we can also approximate the integrated square error
$\delta(F, SF)$ by $\delta(SF, F)$, where $\psi(SF(x)) = 1$.

The following are some classical and modified
distance measures used in the analysis where F is the underlying
distribution function and SF is the continuous differentiable
sample distribution function. Each distance
measure is listed only with respect to closeness of the
distribution functions F and SF. Substitution of f and sf
for F and SF respectively in only the absolute value or
squared terms gives the corresponding distance measure for
the density functions. Note that the argument of both the
weight function $\psi$ and differentiation operator d is still
the distribution function, not the density function.

1. Kolmogorov-Smirnov (KS) distance

$$\delta(F,SF) = \sup_{-\infty < x < \infty} |F(x) - SF(x)|$$

approximated by $\max_i |F(X_i) - SF(X_i)|$

2. KS integral distance

$$\delta(F,SF) = \int_{-\infty}^{\infty} |F(x) - SF(x)| \, dF(x)$$

3. Modified KS integral distance

$$\delta(SF,F) = \int_{-\infty}^{\infty} |SF(x) - F(x)| \, dSF(x)$$

4. Cramer von Mises (CVM) integral distance

$$\delta(F,SF) = \int_{-\infty}^{\infty} (F(x) - SF(x))^2 \, dF(x)$$

5. Modified CVM integral distance

$$\delta(SF,F) = \int_{-\infty}^{\infty} (SF(x) - F(x))^2 \, dSF(x)$$

6. Anderson Darling (AD) integral distance

$$\delta(F,SF) = \int_{-\infty}^{\infty} \frac{(F(x) - SF(x))^2}{F(x)(1 - F(x))} \, dF(x)$$

7. Modified AD integral distance

$$\delta(SF,F) = \int_{-\infty}^{\infty} \frac{(SF(x) - F(x))^2}{SF(x)(1 - SF(x))} \, dSF(x)$$

8. Average square error

$$ASE = \frac{1}{n} \sum_{i=1}^{n} (F(X_i) - SF(X_i))^2$$
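
As a rough numerical illustration of the modified measures, the sketch below approximates the KS distance and the modified KS and CVM integral distances on a grid with the trapezoidal rule, using dSF(x) = sf(x) dx. The construction of SF from equation 3.6 is not reproduced here, so a logistic distribution function serves as a hypothetical smooth stand-in for SF; the names `standin_SF`, `standin_sf`, and `modified_distances` are illustrative assumptions, not part of the original programs.

```python
from math import erf, exp, sqrt


def normal_cdf(x):
    """Standard normal distribution function, playing the role of F."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def standin_SF(x):
    """Hypothetical smooth stand-in for SF (a logistic CDF), not equation 3.6."""
    return 1.0 / (1.0 + exp(-1.7 * x))


def standin_sf(x):
    """Derivative of the stand-in SF."""
    s = standin_SF(x)
    return 1.7 * s * (1.0 - s)


def modified_distances(F, SF, sf, lo, hi, m=4000):
    """Return (KS, modified KS integral, modified CVM integral) on [lo, hi]."""
    h = (hi - lo) / m
    xs = [lo + i * h for i in range(m + 1)]
    diff = [SF(x) - F(x) for x in xs]
    weight = [sf(x) for x in xs]  # dSF(x) = sf(x) dx

    def trapezoid(vals):
        return h * (sum(vals) - 0.5 * (vals[0] + vals[-1]))

    ks = max(abs(d) for d in diff)
    mks = trapezoid([abs(d) * w for d, w in zip(diff, weight)])
    mcvm = trapezoid([d * d * w for d, w in zip(diff, weight)])
    return ks, mks, mcvm


ks, mks, mcvm = modified_distances(normal_cdf, standin_SF, standin_sf, -8.0, 8.0)
```

The modified AD distance is omitted from the sketch because its weight 1/[SF(x)(1-SF(x))] is unbounded at the ends of the support and needs more careful treatment than a uniform grid provides.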
Generalized Exponential Power (GEP) Distribution


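The Generalized Exponential Power distribution is a
three parameter family of symmetric distributions whose tail
length ranges from extremely platykurtic to extremely
leptokurtic (Ref 60). While, in general, the distribution
function does not exist in closed form, the density function
depends on $\mu$, $\sigma$, and p, the location, scale, and shape
parameters respectively:

$$f(x) = \frac{1}{2\,\Gamma(1 + 1/p)\,A(p)\,\sigma} \exp\left\{ -\left| \frac{x - \mu}{A(p)\,\sigma} \right|^{p} \right\}$$

where

$$A(p) = \left[ \frac{\Gamma(1/p)}{\Gamma(3/p)} \right]^{1/2}$$

and $-\infty < x < \infty$, $-\infty < \mu < \infty$, $0 < \sigma < \infty$, $1 \le p < \infty$.

For this distribution, $E(X) = \mu$ and $\mathrm{Var}(X) = \sigma^2$.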

Three special cases occur for specific choices of
the shape parameter p:

1. p=1 reduces the GEP distribution to the Laplace
or double exponential distribution.

2. p=2 reduces the GEP distribution to the normal
distribution.

3. As $p \to \infty$, the GEP distribution approaches the
uniform distribution. Although $p \to \infty$ is a limiting case,
we include the uniform distribution to complete the family.
To avoid the limit argument in discussions, we will consider
$p = \infty$ to represent the uniform distribution.
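
The special cases can be checked numerically. The sketch below assumes the usual exponential power parameterization in which the scale is chosen so that Var(X) equals sigma squared, which is consistent with the moments stated above; the function names `gep_scale` and `gep_pdf` are hypothetical and are not taken from the original programs.

```python
from math import exp, gamma, pi, sqrt


def gep_scale(p):
    """Scale factor that makes the variance equal to sigma**2 (assumed form)."""
    return sqrt(gamma(1.0 / p) / gamma(3.0 / p))


def gep_pdf(x, mu=0.0, sigma=1.0, p=2.0):
    """Exponential power density with location mu, scale sigma, shape p >= 1."""
    b = sigma * gep_scale(p)
    return p / (2.0 * b * gamma(1.0 / p)) * exp(-((abs(x - mu) / b) ** p))


def normal_pdf(x, mu=0.0, sigma=1.0):
    z = (x - mu) / sigma
    return exp(-0.5 * z * z) / (sigma * sqrt(2.0 * pi))


def laplace_pdf(x, mu=0.0, sigma=1.0):
    """Double exponential density scaled to have variance sigma**2."""
    b = sigma / sqrt(2.0)
    return exp(-abs(x - mu) / b) / (2.0 * b)


# p = 2 should agree with the normal, p = 1 with the double exponential.
gap_normal = abs(gep_pdf(0.3, p=2.0) - normal_pdf(0.3))
gap_laplace = abs(gep_pdf(0.3, p=1.0) - laplace_pdf(0.3))
```

For large p the density flattens toward a uniform density on an interval of half-width sigma times the square root of 3, which is the limiting case described in item 3.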
Appendix 3
Critical Values

Tables A3.1 through A3.10 list the critical values
of the eight new test statistics: D5, D6, DMR, W5, W6, WMR,
A5, and A6. Two null hypothesis situations are considered:
(1) the null distribution completely specified, and (2) the
null distribution parameters estimated. For the normal
distribution, the parameters were estimated using the uniformly
minimum variance unbiased estimates $\bar{X}$ and S. For
the extreme value distribution, the parameters were estimated
using the maximum likelihood method. A Newton Raphson
iteration scheme was employed. Critical values for the
normal distribution are listed in Tables A3.1 through A3.5.
Critical values for the extreme value distribution are
listed in Tables A3.6 through A3.10. Values are given for
sample sizes 10(10)50 and alpha levels .20, .15, .10, .05,
.025, and .01.
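
A minimal sketch of how such a table might be used follows. It assumes the usual upper-tail rule for distance statistics (reject the null hypothesis when the observed statistic exceeds the tabled critical value); the dictionary and function names are hypothetical, and the three rows shown are the .10, .05, and .01 entries for D5, W5, and A5 as they appear in Table A3.1 for a completely specified normal null hypothesis with sample size 10.

```python
# Critical values keyed by statistic name, then by alpha level.
# Assumed source: Table A3.1, null distribution completely specified, n = 10.
CRITICAL_NORMAL_SPECIFIED_N10 = {
    "D5": {0.10: 0.2739, 0.05: 0.3147, 0.01: 0.3914},
    "W5": {0.10: 0.3429, 0.05: 0.4114, 0.01: 0.7164},
    "A5": {0.10: 3.082, 0.05: 4.416, 0.01: 7.669},
}


def reject_null(statistic, observed, alpha, table=CRITICAL_NORMAL_SPECIFIED_N10):
    """Upper-tail decision rule (an assumption): reject when observed > critical."""
    return observed > table[statistic][alpha]


decision_05 = reject_null("D5", 0.33, 0.05)
decision_01 = reject_null("D5", 0.33, 0.01)
```

An observed D5 of 0.33 would be significant at the .05 level but not at the .01 level under this rule.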
TABLE A3.1

CRITICAL VALUES--NORMAL DISTRIBUTION--SAMPLE SIZE 10

Null Distribution Completely Specified

                                Alpha Level
Statistic   .20       .15       .10       .05       .025      .01
D5          .2249     .2436     .2739     .3147     .3503     .3914
D6          .2238     .2439     .2712     .3108     .3487     .3903
DMR         .2656     .2846     .3114     .3509     .3853     .4192
W5          .2236     .2667     .3429     .4114     .5578     .7164
W6          .2090     .2549     .3178     .4243     .5218     .6767
WMR         .2239     .2622     .3240     .4258     .5106     .6509
A5          1.997     2.451     3.082     4.416     5.631     7.669
A6          1.812     2.193     2.806     4.013     5.370     7.306

Null Distribution Parameters Estimated

                                Alpha Level
Statistic   .20       .15       .10       .05       .025      .01
D5          .08559    .09379    .1045     .1202     .1342     .1519
D6          .0961     .1042     .1147     .1303     .1455     .1605
DMR         .1622     .1721     .1855     .2042     .2188     .2374
W5          .02626    .03120    .03801    .05103    .06626    .08648
W6          .02866    .03469    .04270    .05676    .06899    .09081
WMR         .07258    .07960    .09003    .1075     .1214     .1478
A5          .3596     .4414     .5551     .7616     1.024     1.312
A6          .3700     .4482     .5782     .7959     1.069     1.353
TABLE A3.2

CRITICAL VALUES--NORMAL DISTRIBUTION--SAMPLE SIZE 20

Null Distribution Completely Specified

                                Alpha Level
Statistic   .20       .15       .10       .05       .025      .01
D5          .1521     .1666     .1885     .2160     .2354     .2685
D6          .1572     .1725     .1927     .2228     .2428     .2712
DMR         .2034     .2177     .2373     .2687     .2922     .3205
W5          .2018     .2491     .3199     .4267     .5299     .6916
W6          .2024     .2509     .3200     .4271     .5316     .6788
WMR         .2314     .2749     .3445     .4550     .5551     .6838
A5          1.447     1.755     2.183     2.907     3.791     5.325
A6          1.435     1.760     2.168     2.837     3.809     5.157

Null Distribution Parameters Estimated

                                Alpha Level
Statistic   .20       .15       .10       .05       .025      .01
D5          .05548    .06104    .06730    .07698    .08629    .09618
D6          .07071    .07698    .08498    .09649    .1083     .1204
DMR         .1335     .1409     .1510     .1646     .1754     .1921
W5          .02286    .02728    .03373    .04573    .05793    .07241
W6          .03240    .03866    .04739    .06295    .07948    .09941
WMR         .07858    .08654    .09843    .1212     .1396     .1662
A5          .2057     .2477     .3187     .4829     .6855     .9754
A6          .2656     .3250     .4123     .6126     .8104     1.112
TABLE A3.3

CRITICAL VALUES--NORMAL DISTRIBUTION--SAMPLE SIZE 30

Null Distribution Completely Specified

                                Alpha Level
Statistic   .20       .15       .10       .05       .025      .01
D5          .1252     .1368     .1521     .1738     .1940     .2195
D6          .1281     .1390     .1540     .1765     .1962     .2232
DMR         .1717     .1835     .1992     .2211     .2407     .2661
W5          .1970     .2421     .3007     .4067     .5189     .6636
W6          .1982     .2428     .3015     .4068     .5243     .6624
WMR         .2365     .2757     .3371     .4365     .5554     .7058
A5          1.281     1.530     1.928     2.556     3.456     4.562
A6          1.277     1.534     1.903     2.563     3.396     4.517

Null Distribution Parameters Estimated

                                Alpha Level
Statistic   .20       .15       .10       .05       .025      .01
D5          .05076    .05525    .06136    .07162    .08047    .08940
D6          .05670    .06168    .06866    .07950    .08895    .09939
DMR         .1130     .1194     .1275     .1414     .1520     .1659
W5          .02544    .03011    .03764    .05045    .06426    .08333
W6          .03025    .03560    .04392    .05904    .07528    .09601
WMR         .07743    .08660    .09949    .1208     .1415     .1699
A5          .1816     .2198     .2747     .3948     .5619     .7823
A6          .2102     .2534     .3162     .4595     .6245     .8625
TABLE A3.4

CRITICAL VALUES--NORMAL DISTRIBUTION--SAMPLE SIZE 40

Null Distribution Completely Specified

                                Alpha Level
Statistic   .20       .15       .10       .05       .025      .01
D5          .1066     .1162     .1289     .1511     .1709     .1916
D6          .1100     .1194     .1314     .1528     .1726     .1948
DMR         .1517     .1619     .1752     .1963     .2161     .2380
W5          .1957     .2234     .2915     .4101     .5133     .7017
W6          .1992     .2370     .2942     .4137     .5198     .7071
WMR         .2388     .2800     .3354     .4610     .5670     .7371
A5          1.159     1.390     1.723     2.367     3.176     4.183
A6          1.188     1.421     1.744     2.388     3.193     4.154

Null Distribution Parameters Estimated

                                Alpha Level
Statistic   .20       .15       .10       .05       .025      .01
D5          .04336    .04753    .05264    .06066    .06760    .07505
D6          .04942    .05352    .05936    .06798    .07591    .08455
DMR         .1016     .1075     .1134     .1239     .1346     .1456
W5          .02434    .02861    .03571    .04881    .06033    .07552
W6          .02997    .03510    .04333    .05841    .07208    .08959
WMR         .07907    .08729    .09978    .1211     .1433     .1654
A5          .1619     .1902     .2424     .3364     .4312     .5763
A6          .1964     .2309     .2864     .3942     .5003     .5480
TABLE A3.5

CRITICAL VALUES--NORMAL DISTRIBUTION--SAMPLE SIZE 50

Null Distribution Completely Specified

                                Alpha Level
Statistic   .20       .15       .10       .05       .025      .01
D5          .09375    --        .1139     .1324     .1491     .1657
D6          .09685    .1054     .1167     .1349     .1516     .1692
DMR         .1363     .1456     .1583     .1748     .1926     .2129
W5          .1848     .2215     .2847     .3998     .4935     .6352
W6          .1903     .2287     .2921     .4070     .5046     .6440
WMR         .2325     --        .3305     .4510     .5541     --
A5          1.075     1.267     1.624     2.173     2.748     3.598
A6          1.112     1.319     1.659     2.218     2.784     3.619

Null  Distribution  Parameters  Estimated 


TABLE A3.6

CRITICAL VALUES--EXTREME VALUE DISTRIBUTION--SAMPLE SIZE 10

Null Distribution Completely Specified

                                Alpha Level
Statistic   .20       .15       .10       .05       .025      .01
D5          .2318     .2534     .2808     .3256     .3656     .4104
D6          .2269     .2503     .2769     .3205     .3579     .4057
DMR         .2660     .2873     .3108     .3536     .3891     .4384
W5          .2401     .2868     .3559     .4802     .6194     .8060
W6          .2193     .2655     .3270     .4444     .5766     .7443
WMR         .2258     .2640     .3277     .4284     .5502     .7121
A5          2.060     2.578     3.269     4.516     6.049     8.173
A6          1.864     2.308     2.970     4.104     5.680     8.139

Null Distribution Parameters Estimated

                                Alpha Level
Statistic   .20       .15       .10       .05       .025      .01
D5          .08819    .09628    .1064     .1234     .1382     .1589
D6          .09683    .1052     .1162     .1316     .1446     .1646
DMR         .1646     .1739     .1867     .2069     .2247     .2471
W5          .03060    .03724    .04607    .06375    .08351    .1066
W6          .03277    .03936    .04948    .06446    .08231    .1068
WMR         .07576    .08359    .09478    .1124     .1320     .1544
A5          .3451     .4313     .5539     .7675     .9640     1.344
A6          .3586     .4367     .5500     .7644     1.010     1.340
TABLE A3.7

CRITICAL VALUES--EXTREME VALUE DISTRIBUTION--SAMPLE SIZE 20

Null Distribution Completely Specified

                                Alpha Level
Statistic   .20       .15       .10       .05       .025      .01
D5          .1552     .1710     .1899     .2183     .2456     .2737
D6          .1585     .1733     .1911     .2211     .2489     .2760
DMR         .2048     .2183     .2356     .2661     .2911     .3183
W5          .2122     .2627     .3331     .4530     .5681     .7441
W6          .2061     .2516     .3201     .4363     .5523     .7129
WMR         .2336     .2722     .3316     .4491     .5514     .7138
A5          1.495     1.811     2.265     3.111     4.112     5.772
A6          1.465     1.767     2.202     3.014     4.056     5.731

Null Distribution Parameters Estimated

                                Alpha Level
Statistic   .20       .15       .10       .05       .025      .01
D5          .061170   .06642    .07342    .08512    .09431    .1078
D6          .06939    .07652    .08366    .09587    .1076     .1201
DMR         .1313     .1385     .1476     .1627     .1781     .1946
W5          .02757    .03302    .04118    .05543    .07108    .09624
W6          .03237    .03841    .04727    .06333    .08083    .1098
WMR         .07786    .08604    .09769    .1182     .1411     .1690
A5          .2189     .2724     .3623     .5310     .7965     1.185
A6          .2478     .3004     .3953     .5806     .8457     1.244
TABLE A3.8

CRITICAL VALUES--EXTREME VALUE DISTRIBUTION--SAMPLE SIZE 30

Null Distribution Completely Specified

                                Alpha Level
Statistic   .20       .15       .10       .05       .025      .01
D5          .1245     .1360     .1512     .1751     .1958     .2205
D6          .1261     .1375     .1524     .1764     .1965     .2226
DMR         .1697     .1818     .1992     .2221     .2411     .2623
W5          .1988     .2383     .2968     .4213     .5244     .6631
W6          .1965     .2358     .2940     .4128     .5252     .6636
WMR         .2297     .2686     .3279     .4317     .5418     .6765
A5          1.279     1.523     1.909     2.587     3.339     4.461
A6          1.273     1.504     1.881     2.572     3.197     4.156

Null Distribution Parameters Estimated

                                Alpha Level
Statistic   .20       .15       .10       .05       .025      .01
D5          .05289    .05714    .06325    .07253    .08125    .09205
D6          .05660    .06117    .06748    .07660    .08682    .09707
DMR         .1120     .1178     .1252     .1385     .1494     .1625
W5          .02788    .03293    .04078    .05513    .07074    .09445
W6          .03094    .03655    .04480    .05850    .07518    .09842
WMR         .07716    .08507    .09728    .1194     .1419     .1678
A5          .1999     .2376     .2998     .4358     .5973     .8912
A6          .2175     .2562     .3186     .4448     .6115     .9352
TABLE A3.9

CRITICAL VALUES--EXTREME VALUE DISTRIBUTION--SAMPLE SIZE 40

Null Distribution Completely Specified

                                Alpha Level
Statistic   .20       .15       .10       .05       .025      .01
D5          .1081     .1176     .1321     .1531     .1679     .1850
D6          .1098     .1206     .1348     .1542     .1693     .1869
DMR         .1507     .1623     .1762     .1953     .2124     .2309
W5          .1974     .2406     .2960     .4171     .5250     .6448
W6          .1969     .2398     .2957     .4152     .5133     .6365
WMR         .2331     .2735     .3401     .4477     .5476     .6613
A5          1.176     1.398     1.764     2.398     3.028     3.799
A6          1.186     1.414     1.754     2.367     3.022     3.826

Null Distribution Parameters Estimated

                                Alpha Level
Statistic   .20       .15       .10       .05       .025      .01
D5          .04923    .05265    .05720    .06406    .07202    .07997
D6          .05134    .05524    .06006    .06870    .07629    .08472
DMR         .1008     .1059     .1130     .1242     .1336     .1455
W5          .03104    .03627    .04378    .05671    .07083    .09323
W6          .03443    .03922    .04729    .06188    .07814    .09916
WMR         .08026    .08938    .1025     .1234     .1428     .1676
A5          .2109     .2503     .2995     .4034     .5309     .7445
A6          .2296     .2648     .3236     .4309     .5654     .7817

TABLE A3.10

CRITICAL VALUES--EXTREME VALUE DISTRIBUTION--SAMPLE SIZE 50

Null Distribution Completely Specified

                                Alpha Level
Statistic   .20       .15       .10       .05       .025      .01
D5          .09797    .1067     .1181     .1363     .1530     .1727
D6          .09998    .1092     .1199     .1376     .1555     .1757
DMR         .1385     .1479     .1590     .1769     .1933     .2153
W5          .2042     .2425     .3032     .4239     .5267     .6965
W6          .2038     .2447     .3002     .4242     .5226     .6935
WMR         .2433     .2788     .3440     .4537     .5596     .7183
A5          1.173     1.403     1.733     2.345     2.978     3.813
A6          1.187     1.420     1.744     2.343     2.969     3.780

Null Distribution Parameters Estimated

                                Alpha Level
Statistic   .20       .15       .10       .05       .025      .01
D5          .04586    .04870    .05278    .05896    .06472    .07132
D6          .04669    .04976    .05413    .06058    .06797    .07452
DMR         .09065    .09508    .1014     .1110     .1185     .1295
W5          .03198    .03692    .04404    .05700    --        --
Power Comparisons


Tables A4.1 through A4.12 list the results of power
comparisons made using the normal and extreme value distributions
in the null hypothesis. Tables are listed by
null distribution type (normal or extreme value), null
hypothesis type (completely specified or parameters estimated)
and alpha level (.10, .05, or .01). Each table
includes eight distributions as alternative hypotheses and
five different random sample sizes (four for the Cauchy).
All entries represent the number of samples significant at
the given alpha level from a Monte Carlo sample size of
1000 trials. Actual power of each test may be obtained by
dividing each entry by 1000.
Appendix 5

Computational  Methods  Used 

This  appendix  describes  various  numerical  methods 
used  throughout  this  study.  In  particular,  we  will 
describe  methods  for  random  variate  generation,  numerical 
integration,  and  iterative  solution  for  inverting  the 
approximated  distribution  function.  All  calculations  were 
performed  using  a  CDC  Cyber  74/750  system  located  at  the 
Aeronautical  Systems  Division  Computer  Center,  Wright- 
Patterson  Air  Force  Base,  Ohio. 

Generating  Random  Variates 

Depending on the underlying distribution, random
variates were generated from two main sources. Uniform
random variables were constructed using the multiplicative
congruential generator described by McGrath and Irving
(Ref 54). Random samples from the double exponential,
exponential, triangular, and extreme value distributions
were generated by applying the corresponding inverse probability
integral transform to a set of uniform random variates.
Random samples from the four parameter $\lambda$ family of
Ramberg, et al., were generated by transforming uniform
random variates using the percentile function

$$R(p) = \lambda_1 + \frac{p^{\lambda_3} - (1-p)^{\lambda_4}}{\lambda_2},$$

where the $\lambda_i$, $i = 1, \ldots, 4$, are the
parameters of the specific $\lambda$ distribution, and p is a uniform
random variate on [0,1] (Ref 72). Subroutines from
the International Mathematical and Statistical Libraries
were used to generate random samples for the normal (using
the polar method), Weibull, gamma, beta, and Cauchy distributions.
If necessary, location and/or scale transformations
were applied to adjust standard variates to specific
underlying populations.
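
The inverse probability integral transform can be sketched in a few lines. The example below is illustrative only: it uses Python's built-in generator rather than the McGrath and Irving generator, it assumes standard forms (unit-rate exponential, double exponential with location 0 and scale 1, and the smallest extreme value form F(x) = 1 - exp(-exp(x))), and the function names and the lambda parameter values in the usage line are hypothetical.

```python
import math
import random


def open_uniform(rng):
    """Uniform variate on the open interval (0, 1), avoiding log(0)."""
    u = rng.random()
    while u == 0.0:
        u = rng.random()
    return u


def exponential_variate(u):
    """Invert F(x) = 1 - exp(-x)."""
    return -math.log(1.0 - u)


def double_exponential_variate(u):
    """Invert the standard double exponential (Laplace) distribution function."""
    if u < 0.5:
        return math.log(2.0 * u)
    return -math.log(2.0 * (1.0 - u))


def extreme_value_variate(u):
    """Invert F(x) = 1 - exp(-exp(x)); the smallest extreme value form is assumed."""
    return math.log(-math.log(1.0 - u))


def lambda_variate(u, lam1, lam2, lam3, lam4):
    """Percentile function R(p) of the four parameter lambda family."""
    return lam1 + (u ** lam3 - (1.0 - u) ** lam4) / lam2


rng = random.Random(12345)
sample = [double_exponential_variate(open_uniform(rng)) for _ in range(5)]
# Illustrative parameter values only; with lam3 == lam4 the median equals lam1.
median = lambda_variate(0.5, 3.0, 0.2, 0.13, 0.13)
```

A location or scale change is then a matter of computing mu + sigma * x for each standard variate x.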

Numerical  Integration 

Two specific procedures used for evaluating the
finite integral, $\int_a^b f(x)\,dx$, were Gaussian quadrature and
Simpson's rule. Initially, in determining the variables
for the nonparametric estimators, a sixteen point Gauss-Legendre
quadrature scheme was used for the following
integrands:

1. $(F(x) - SF(x))^2 \, sf(x)$

2. $(f(x) - sf(x))^2 \, sf(x)$

Quadrature points and weights were taken from tables in
reference 1, page 916. The interval of integration was
the support of the nonparametric estimate, $[X_{\min}, X_{\max}]$.
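
A sixteen point Gauss-Legendre rule can be sketched as follows. Where the study read the points and weights from published tables, this illustration instead computes them by Newton iteration on the Legendre polynomial, which is a substitution made here for self-containment; `gauss_legendre` and `integrate` are hypothetical names.

```python
import math


def gauss_legendre(n):
    """Nodes and weights of the n point Gauss-Legendre rule on [-1, 1]."""
    nodes, weights = [], []
    for k in range(1, n + 1):
        x = math.cos(math.pi * (k - 0.25) / (n + 0.5))  # initial guess for k-th root
        for _ in range(100):
            p0, p1 = 1.0, x  # P_0(x) and P_1(x)
            for j in range(2, n + 1):
                # Three-term recurrence: after the loop p1 = P_n, p0 = P_{n-1}.
                p0, p1 = p1, ((2 * j - 1) * x * p1 - (j - 1) * p0) / j
            dp = n * (x * p1 - p0) / (x * x - 1.0)  # derivative of P_n at x
            dx = p1 / dp
            x -= dx
            if abs(dx) < 1e-15:
                break
        nodes.append(x)
        weights.append(2.0 / ((1.0 - x * x) * dp * dp))
    return nodes, weights


def integrate(g, a, b, n=16):
    """Approximate the integral of g over [a, b] with an n point rule."""
    nodes, weights = gauss_legendre(n)
    half, mid = 0.5 * (b - a), 0.5 * (a + b)
    return half * sum(w * g(mid + half * t) for t, w in zip(nodes, weights))


value = integrate(lambda x: x ** 4, 0.0, 2.0)  # exact value is 32/5
```

In the setting of this appendix, g would be one of the two integrands listed above and [a, b] the support of the nonparametric estimate.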

To evaluate the integrals used for comparisons
of approximate mean integrated square error for both distribution
and density functions and the integrals used in
calculating the goodness of fit statistics, we used a
modified Simpson's rule with error control (Ref 66). Given
an ordered sample of size n and the two endpoints of the
support of the nonparametric approximation, we constructed
n+1 intervals of the form $[X_{(i)}, X_{(i+1)}]$, $i = 0, \ldots, n$, where
$X_{(0)} = X_{\min}$ and $X_{(n+1)} = X_{\max}$. For each integrand, we used
Simpson's rule on each interval. If the summed value of
the approximation was not sufficiently close, we divided
each interval in half and repeated the procedure. Integrands
evaluated by this method included:

1. $(F(x) - SF(x))^2 \, sf(x)$

2. $(f(x) - sf(x))^2 \, sf(x)$

3. $(F(x) - SF(x))^2 \, sf(x) / [SF(x)(1 - SF(x))]$

4. $sf(x)$

A stopping criterion for integral convergence was
selected based on the construction of our nonparametric
density estimate. We know that $\int sf(x)\,dx = 1$ on $[X_{\min}, X_{\max}]$.
We also know that the underlying distribution function F
and density function f are reasonably smooth. By using
subintervals based on the data points, we should be able
to detect any "spikes" in the integrands. Using this
information, we used as the approximation to each integral
the value of the Simpson's rule calculations when
$|\int sf(x)\,dx - 1.0| < 0.01$. Since sf(x) is the "noisiest" contribution
to the four integrands, approximating $\int sf(x)\,dx$ to
a sufficient degree gives us a measure of confidence in
the remaining integral approximations.
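
The procedure can be sketched as below, under stated assumptions: Simpson's rule is applied on each data-based interval, every interval is split in half until the integral of sf is within ERR of 1, and the other integrands are then evaluated at that same refinement. The stand-in density (pi/2) sin(pi x) on [0, 1], the breakpoints, and the function names are hypothetical choices for illustration and are not the estimators of this study.

```python
import math


def simpson_on_intervals(g, breakpoints, splits):
    """Simpson's rule on each [x_i, x_(i+1)], each cut into `splits` equal pieces."""
    total = 0.0
    for a, b in zip(breakpoints[:-1], breakpoints[1:]):
        h = (b - a) / splits
        for k in range(splits):
            lo = a + k * h
            hi = lo + h
            total += (hi - lo) / 6.0 * (g(lo) + 4.0 * g(0.5 * (lo + hi)) + g(hi))
    return total


def integrate_with_mass_check(integrands, sf, breakpoints, err=0.01, max_halvings=12):
    """Halve every interval until |integral of sf - 1| < err, then integrate the rest."""
    splits = 1
    for _ in range(max_halvings + 1):
        mass = simpson_on_intervals(sf, breakpoints, splits)
        if abs(mass - 1.0) < err:
            values = [simpson_on_intervals(g, breakpoints, splits) for g in integrands]
            return values, mass, splits
        splits *= 2
    raise RuntimeError("integral of sf did not come within err of 1")


def standin_sf(x):
    """Hypothetical density on [0, 1] that vanishes at both endpoints."""
    return 0.5 * math.pi * math.sin(math.pi * x)


# X_min, three data points, X_max (illustrative values).
breakpoints = [0.0, 0.1, 0.35, 0.6, 1.0]
values, mass, splits = integrate_with_mass_check(
    [lambda x: x * standin_sf(x)], standin_sf, breakpoints
)
```

Tying the refinement level to the integral of sf means that a single cheap check governs all of the integrands at once, which is the design choice described above.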
To see numerically how the choice of stopping
criterion affected the other integrals, we generated
twenty-five random samples of size 100 from the standard
normal distribution. Then we calculated the modified CVM
integrals for both the distribution and density functions
as well as the integral of the density function approximation
using all six nonparametric models. We used two
different stopping criterion values, $|\int sf(x)\,dx - 1.0| < ERR$,
where ERR = 0.01 or 0.001. Table A5.1 lists the average
values of the integrals for the twenty-five samples. Each
entry corresponds to a specific model approximation, integrand
and choice of ERR. A comparison between the entries
corresponding to ERR choices of 0.01 and 0.001 for each
integrand shows that a tighter bound on the integral of the
density approximation has a negligible effect. The convergence
error criterion was then set at 0.01.

To evaluate the integrals associated with the location
parameter estimates of Chapter VI, we again used a
modified Simpson's rule. We divided the support into subintervals
using the data points as before. However, since
we only needed one integral evaluated, we chose a straightforward
application of Simpson's rule with error control.
The integral, $\int x \, sf(x)\,dx$, was said to converge when the
change in the approximation was less than 0.1 percent.
TABLE A5.1


Iterative  Solution  for 
Inverting  the  Approximated 
Distribution  Function 

To calculate the pseudosample points for the smoothing
routine or to calculate any percentile, such as the
median, we needed a method for inverting the sample distribution
function. Since we can calculate the density
function at any point, a Newton Raphson iteration scheme
was employed. The nth approximation $x^{(n)}$ was calculated
as $x^{(n)} = x^{(n-1)} - SF(x^{(n-1)}) / sf(x^{(n-1)})$. Convergence
was defined when the absolute value of the difference between
successive approximations was less than $10^{-5}$ (Ref 66).
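
A minimal sketch of the inversion follows. To locate the point where SF(x) equals a target probability p, the sketch applies Newton Raphson to SF(x) - p, which is an assumption about how the recursion above is used for a given percentile. A logistic distribution function serves as a hypothetical stand-in for SF, and `invert_distribution` is an illustrative name.

```python
import math


def standin_SF(x):
    """Hypothetical smooth distribution function (logistic), not equation 3.6."""
    return 1.0 / (1.0 + math.exp(-x))


def standin_sf(x):
    s = standin_SF(x)
    return s * (1.0 - s)


def invert_distribution(SF, sf, p, x0, tol=1e-5, max_iter=100):
    """Newton Raphson solution of SF(x) = p, stopping when |x_new - x| < tol."""
    x = x0
    for _ in range(max_iter):
        density = sf(x)
        if density == 0.0:
            # The density is flat here, so the Newton step is undefined.
            raise ZeroDivisionError("sf(x) is zero at the current iterate")
        x_new = x - (SF(x) - p) / density
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    raise RuntimeError("Newton Raphson iteration did not converge")


median = invert_distribution(standin_SF, standin_sf, 0.5, x0=0.5)
upper_decile = invert_distribution(standin_SF, standin_sf, 0.9, x0=0.5)
```

Because a density estimate may vanish at some points, a practical routine would need a starting value away from such points or a fallback such as bisection; the guard in the sketch only reports the problem.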
Appendix 6

A  Finite  Support  Modification  to  Insure 
Inclusion  of  All  Original  Data  Points 

For either an extremely leptokurtic or platykurtic
distribution, the smoothing routine sometimes generated a
pseudosample for which the support of the nonparametric
distribution function did not contain the interval
$[X_{(1)}, X_{(n)}]$, where $X_{(1)}$ and $X_{(n)}$ are extreme order
statistics of the original sample. To insure that the interval
$[X_{\min}, X_{\max}]$, the support generated by the pseudosample,
contained $[X_{(1)}, X_{(n)}]$,
the following algorithm was added. If $X_{\min}$, the lower
endpoint of the finite support based on a pseudosample, is
greater than $X_{(1)}$, the smallest order statistic of the
original sample, replace the inversion point of the pseudosample
determined by $FS^{-1}(G_1)$ by $X_{(1)}$, and similarly for
$X_{\max}$ less than $X_{(n)}$. This modification uses the information
that the distribution function is defined over at
least the set $[X_{(1)}, X_{(n)}]$, and also only adds enough tail
weight by adjusting the pseudosample to insure that the
final support contains the original data points.

The above modification was used for all models
except Model 3. Since Model 3 uses fixed $X_{(0)}$ and $X_{(n+1)}$
extrapolation points for all subsamples, we merely set
$X_{\min} = X_{(0)}$ and/or $X_{\max} = X_{(n+1)}$, where $X_{(0)}$ and $X_{(n+1)}$
were the extrapolation points based on the entire sample,
whenever the interval $[X_{\min}, X_{\max}]$ did not contain
$[X_{(1)}, X_{(n)}]$. This again insured that the final distribution
function approximation was defined over a finite support
which contained all of the data points.
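
The replacement step for the models other than Model 3 can be sketched as follows. The rule that maps a pseudosample to its support endpoints is not reproduced in this appendix, so the sketch takes it as a caller-supplied function; `toy_support`, which extends each end by half the adjacent gap, is a purely hypothetical stand-in, as are the other names and the data values.

```python
def toy_support(points):
    """Hypothetical endpoint rule: extend each end by half the adjacent gap."""
    lower = points[0] - 0.5 * (points[1] - points[0])
    upper = points[-1] + 0.5 * (points[-1] - points[-2])
    return lower, upper


def adjust_pseudosample(pseudosample, original, support_of):
    """Replace extreme pseudosample points when the support misses the data."""
    points = sorted(pseudosample)
    smallest, largest = min(original), max(original)
    x_min, x_max = support_of(points)
    if x_min > smallest:
        points[0] = smallest   # lower inversion point replaced by X_(1)
    if x_max < largest:
        points[-1] = largest   # upper inversion point replaced by X_(n)
    return points


original = [0.0, 1.0, 2.0, 10.0]
pseudosample = [0.5, 1.2, 2.1, 6.0]
adjusted = adjust_pseudosample(pseudosample, original, toy_support)
new_min, new_max = toy_support(adjusted)
```

Only the extreme pseudosample points move, so the interior of the smoothed estimate is left unchanged while the tails gain just enough weight to cover the data.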